MOLECULES 

AND 

THOUGHTS 



Yuri Tarnopolsky 

























by Yuri Tarnopolsky 


2018 




CONTENTS 


Introduction 4 

1. Molecules and Thoughts 9 

2. APPENDIX to Molecules and Thoughts 117 

3. Transition state in patterns of history 134 

4. Tikki Tikki Tembo: The Chemistry of Protolanguage 184 

5. Pattern Theory and "Poverty of Stimulus" argument in linguistics 304 

6. The Chemistry of Semantics 361 

7. Salt: The Incremental Chemistry of Language Acquisition 387 

8. Salt 2: Incremental Extraction of Grammar by Simplistic Rules 423 

9. Ideogram: A Simpleton in a Complex Family 459 

10. Do Piraha speak Nean? 483 

11. The Rusty Bolts of Complexity: Ideograms for Evolving Complex System 493 

12. MATLAB codes for topological analysis of text and grammar extraction 544 

13. Pattern Chemistry of Thought and Speech and their Hypothetical Ancestor 565 

14. APPENDIX to Pattern Chemistry of Thought and Speech 613 

15. Pattern Chemistry of Language 634 

16. Pattern Chemistry of the Origin of Mind 639 



INTRODUCTION 


This file comprises the stages of my search for generalization of the chemical paradigm over real world 
systems of high and evolving complexity, including human mind and society. 

Atoms selectively bond with each other. Some bonds are stable, others are weak or negative, which means 
mutual repulsion. Molecules can be stable, unstable, or practically impossible. As soon as various molecules 
are brought into contact, the bonds between atoms start a process of rearrangement into another set of 
molecules with lower total energy and higher stability. This is the core of the chemical paradigm. Chemistry is 
not just about structures but mainly about chemical events, just as our history is about events on planet Earth: 
events with a beginning and an end, consisting of smaller events. For a human observer, events are mostly known, 
but some are new, immediately turning into old. History displays the evanescent novelty of events but an 
extreme frugality with patterns. Patterns are variables of the algebra of natural complexity. 

Unlike mathematical symbols and equations, a chemical reaction runs in real time, sometimes instantaneously, 
sometimes over years. A chemical event at the moment of contact is in its initial state. The final state comes 
after some short, long, or indefinite time. Yet the speed of transformation depends on what lies between 
the initial and final states: the transition state. The latter is fleeting, unstable, irregular, turbulent, and against 
the rules. It is a barrier on the way to the final state, and crossing it requires additional energy. It is the 
bottleneck of transformation, and its stability defines the speed or probability of transformation. Physics is 
about processes and it does not care who pulls the trigger and who falls to the ground, while history is not 
about the bullet trajectory but about the victim of a murder and the shooter: it is about events. 

Events in the real world are similar to chemical events: 

1. A fast revolution, political or technical, between two static periods. 

2. An American presidential campaign, for example between the Obama and Trump terms. 

3. An intellectual discovery between formulating a problem and coming to a solution. 

4. A correction of the stock market or a recession between two bull runs. 

5. A war between two periods of peace. A political crisis out of nowhere. 

6. The process of decision-making after a package of new data. 

7. The period between perception and understanding or recognizing a text or a face. 

8. All processes of evolution are long-lasting sequences of events and so is individual human life. 



The Pattern Theory of Ulf Grenander provides a universal apparatus for representing 
states and events in the real world and their reflections in the human mind. It also 
provides a measure of energy or probability for configurations of atomic entities 
and of their regularity. It is the method of representing everything in terms of its 
interconnected constituents, i.e., figuratively, as points and lines, or, more 
realistically, as atoms and bonds. Its roots go back to the atomism of Lucretius and 
to structuralism. 

Unlike symbols in mathematics, Grenander's generators, configurations, and 
patterns (classes of configurations) possess some properties of material 
phenomena, for example, probability, stability, and regularity, and yet are 
applicable to both fleeting thoughts and tenacious concepts in the mind. In a way, 
it is materialized thought. 

Individual human life is also an event. It begins and it ends. Ulf Grenander and I 
completed our experimental "History as Points and Lines," but it was already too late to go together into 
pattern kinetics and irregular configurations. Whatever nonsense is there in my files, it is entirely my own. 

This yet undiscovered continent of knowledge still waits for somebody with worldwide interests, thirst for 
novelty, and reasonable distance from the final state. 

Probably, my most significant personal realization, inspired by Pattern Theory, was the principle of natural 
(not artificial!) evolution of such very complex systems as life, mind ("intelligence"), and language from utterly 
simple beginnings: 

Complexity naturally, i.e., without human engineering, evolves from 
simplicity by simple and therefore probable steps. 

This principle shows why there is a chasm between natural and artificial intelligence and how it can be bridged 
with not yet predictable consequences. I would call the bridge Artificial Natural Intelligence, ANI. Until then, AI 
has enormous potential for arming humans against each other, while ANI might just expand humanity. Thus, 
our children are already natural products of artificial upbringing. 

I believe that Pattern Theory builds for the first time the long-sought bridge between sciences and humanities. 
As somebody living in both elements, I express my amphibian personality in the style of my texts. 

I offer my apology for my imperfect English. My life is but an event, too. 

Sapienti satis. 

Yuri Tarnopolsky 



Ulf Grenander, 
1923-2016 


2018 





1 


Molecules and Thoughts 

Pattern Complexity and Evolution in Chemical Systems and the Mind 


Yuri Tarnopolsky 


January, 2003 


SUMMARY 


Pattern Theory is a representation of complexity in terms of atom-like blocks and bonds between 
them, similar to chemical structures. The paper attempts to look at chemistry from the pattern 
perspective, “patternize” some general ideas of chemical kinetics, and import them into the 
pattern representation of the mind. In particular, it is intended to complement the recent Patterns 
of Thought by Ulf Grenander [1], which is the main frame of reference for this paper. The 
chemical concepts in question are: chemical complexity, transition state, catalysis, 
non-equilibrium systems, competition and selection of chemical species, and molecular evolution. 
A distinction is made between Ar-complexity (Aristotelian), which displays in the fixed generator 
space, and He-complexity (Heraclitean), which displays in the expanding generator space. The 
mind is regarded as an expanding configuration space, with the topology of a subset of the scale 
of sets (Bourbaki), where configurations compete for presence in consciousness. The concept is 
illustrated with computer simulations of building a He-system and spontaneous activity on some 
connectors. 





Abbreviations: 

LMS: Life, Mind, Society; AI: Artificial Intelligence; NI: Natural Intelligence; Alife: Artificial 
Life; PT: Pattern Theory; Ar-System: A system in a fixed generator space (Aristotelian); 
He-System: A system in a changing generator space (Heraclitean). 


CONTENTS 


Introduction 3 
1. Chemical complexity 7 
2. Molecular patterns 14 
3. Regularity and probability in chemistry 20 
4. Thoughts and molecules 27 
5. Stability in LMS 31 
6. Patterns of transformation 36 
7. Chemical kinetics and transition state 39 
8. Catalysis 50 
9. Competition and selection of configurations 57 
10. The competitive mind 67 
10.1. BIRDS 70 
10.2. PROTO 76 
10.3. Discussion 87 
11. SCALE 90 
Conclusion 100 
References 105 







Introduction 


A lecturer tells some students to learn the phone-book by heart. 
The mathematicians are baffled: 'By heart? You kidding?' 

The physics-students ask: 'Why?' 

The engineers sigh: 'Do we have to?' 

The chemistry-students ask: 'Till next Monday?' 

source: http://www.talisman.org/~erlkonig/humour/science-jokes 


Pattern Theory (PT) is a mathematical representation of objects with large and incompressible 
complexity [2, 3, 4]. Some of the complex objects are static, for example, a telephone directory; 
others are natural and artificial dynamical systems, among them life, mind, and society, for 
which the abbreviation LMS will be used below without distinction between natural and 
artificial. In the long run, a telephone directory evolves, too. 

There is an obvious conceptual isomorphism between PT and chemistry. Both share the 
same principles of atomism, bonding, and transformation. Molecules, forms of life, thoughts, and 
social structures are typical and frequent illustrations of pattern analysis and synthesis in Ulf 
Grenander’s works on Pattern Theory [2, 3, 4]. The parallel with chemistry is widely used in [1], 
for example: 

Note the resemblance to chemistry: generators correspond to atoms, configurations (ideas) to 
molecules, and bonds to bonds ([1], 2.4: Regularity of Mind States: Conformation of Ideas). 





A chemist can easily recognize in PT a kind of meta-chemistry. Molecules look and 
behave like configurations and they are configurations in the eye of a chemist at least 
superficially familiar with PT. 

The similarity between the mind in Patterns of Thought [1] and a chemical system is 
especially conspicuous because of dynamic aspects. To an imaginative chemist, Ulf Grenander’s 
representation of the mind may look like a Faustian apparatus, where the content is brewing in a 
blend of order and spontaneity, out of which Golem, instead of Homunculus, is about to jump 
out. To decide whether the parallel with chemistry is just a metaphor, we need to look at 
chemistry from the point of view of pattern meta-chemistry. 

The science of chemistry is about 150 years old, but only during the last 70 years, which 
is comparable with the 50 years of Artificial Intelligence (AI), has chemistry developed a series 
of concepts, approaches, and subjects that give it its modern shape. A set of abstract ideas of 
chemical origin, known as Artificial Life [5], a mathematical discipline of significant generality, 
is among the developments of the last few decades. 


Thinking about why Roomba [6], the recent “intelligent sweeper vac”, the descendant of the turtle 
CORA (Conditioned Reflex Analog) built by W. Grey Walter [7] in Great Britain in the 40’s, still 
has too limited an intelligence for such a simple job as sweeping a floor, one may suggest that a 
little bit of life, however artificial, would not harm intelligence. 


In this preliminary paper, written from the meta-chemical (i.e., pattern) perspective, the 
relevance of Artificial Life (Alife) for AI and NI is in the focus of attention. Some possible 
applications of generalized and hybridized pattern-chemical ideas to social sciences are also 
briefly considered. 

While it may appear questionable what kind of input AI can have from chemistry, the Ising 
model is an excellent example of importing a general idea from “hard science” to AI. 

Ernst Ising, following ideas of Wilhelm Lenz, worked in the narrow area of 
ferromagnetism. Thinking about the reasons why his idea has spread over such vast areas, including 
AI and Alife, a chemist could note that the model came from the area of condensed matter, i.e., 
structured systems, far removed from the traditional chaotic gas-like ensembles of independent 
particles. Condensed matter is an elementary “natural” case of probabilities on structures where 
interaction is defined on a meaningful topology. Ising’s initial object, a linear spin model, was, 
actually, a stochastic cellular automaton long before the birth of this term. 

Life, mind, and society (LMS), apparently, belong to the same large class of condensed 
systems as ferromagnets: the statistical ensemble is constrained by a structure, and this is what 
Pattern Theory is about. A cardinal difference between the condensed matter of physics and that 
of LMS is the non-equilibrium thermodynamics of the latter, combined with a quasi-solid 
medium for information storage (Erwin Schrödinger: “life is aperiodic crystal”). Besides, the 
variety of connectors in chemistry and LMS is exceptionally wide. From this point of view, 
some new areas of chemical experience, natural and artificial life in particular, can present a case 
for importing them, in a generalized pattern form, into Pattern Theory. As a sample of this import, 
genetic algorithms in pattern recognition can be mentioned [8]. The current trend is to map 
dynamical systems of AI and Alife onto the science of complexity as a foundation. 

Modern chemistry consists of a large number of separate fields and means different 
things for different people. In this paper, chemistry is meant to deal with molecular structure, 
addressing the following three problems: 

1. Analysis: Reconstruct an unknown configuration from a set of its transformations into 
known configurations. 

2. Synthesis: Given the initial and final configurations, design the shortest sequence of 
transformations from one to the other. 

3. Reactivity: Given an initial configuration C₁ at time t₁, predict the most probable 
configuration C₂ at time t₂. 
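
The synthesis problem in the list above can be read as a shortest-path search in a graph whose nodes are configurations and whose edges are allowed transformations. The following is only a minimal Python sketch under that reading: the configuration names and the transformation table are hypothetical placeholders, not real chemistry, and breadth-first search is used simply because it returns a route with the fewest steps.

```python
from collections import deque

def shortest_synthesis(transformations, start, target):
    """Breadth-first search for the shortest chain of configurations."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in transformations.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the target cannot be reached from the start

# Hypothetical configurations and one-step transformations (illustrative only).
TRANSFORMATIONS = {
    "C1": ["C2", "C3"],
    "C2": ["C4"],
    "C3": ["C4", "C5"],
    "C4": ["C6"],
    "C5": ["C6"],
}

route = shortest_synthesis(TRANSFORMATIONS, "C1", "C6")
print(route)
```

A real synthesis plan would also weigh yields and costs of each step, which would turn the search into a weighted one; the sketch ignores that.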

Organic chemistry is in charge of all three problems, as well as of configurations of 
highest known complexity. It looks like a kind of a theater where some chapters of PT are played 
live. 

The following Table lists parallels between chemistry and PT. In pattern symbolism, the 
concepts of transition state, catalysis, replication, and competitive selection would not be tied to 
any specifically chemical substrate. It is an intent of this paper to fill, in a very general form, the 
blanks in the right column. 


Table: Comparison of chemistry and PT 

      Chemistry              Pattern Theory 
 1    Atoms                  Generators 
 2    Bonds                  Bond couples 
 3    Molecules              Configurations 
 4    Types and classes      Patterns 
 5    Reaction               Transformation 
 6    Concentration          Probability of configuration 
 7    Energy E ~ E t         Energy E ~ log P j 
 8    Transition state 
 9    Catalysis 
10    Replication 
11    Molecular Evolution 
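
Rows 6 and 7 of the Table link concentration or probability to energy through a logarithm. A minimal Python sketch of such a link is given here, assuming the Gibbs (Boltzmann) form in which the probability of a configuration is proportional to exp(-E/T); this specific form, the function name, and the energy values are assumptions made for illustration and are not taken from the Table itself.

```python
import math

def gibbs_probabilities(energies, temperature=1.0):
    """Map energies to probabilities, P_i proportional to exp(-E_i / T)."""
    weights = {name: math.exp(-e / temperature) for name, e in energies.items()}
    z = sum(weights.values())  # normalizing constant
    return {name: w / z for name, w in weights.items()}

# Hypothetical energies in arbitrary units (illustrative only).
ENERGIES = {"stable": 0.0, "less_stable": 1.0, "strained": 3.0}

probs = gibbs_probabilities(ENERGIES)
for name, p in probs.items():
    # Up to an additive constant, -T * log(P) recovers the energy.
    print(f"{name}: P = {p:.3f}, -log P = {-math.log(p):.3f}")
```

In this reading, lower energy means higher probability, and differences of -log P between two configurations reproduce their energy differences.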



It is important to forewarn that the description of chemistry in this paper, addressed to 
non-chemists, is very simplistic. Although this paper is the work of a chemist with lifelong 
interests in AI and NI, for whom comparison of molecules with thoughts has been quite natural, 
the author’s knowledge of mathematics, AI, and Alife is superficial and fragmentary. The paper is 
an invitation to a more professional bridge-building from the other bank. 






1. Chemical complexity 

Judging by the volume of publications, chemistry is the largest separate body of scientific 
knowledge. The combinatorial complexity of molecules is enormous. Chemistry studies 
transformations of configurations of unlimited size on the set of about one hundred basic 
generators of the Periodic Table. The most common generators have arity from 1 to 4. There is a 
cornucopia of connector graphs, among them cycles and bridged cycles that play the role of 
stable atom-like subconfigurations. There is also an infinite variety of text-like linear 
sequences. 

The Chemical Abstracts Registry (CAS Registry) counts practically each known 
substance and chemical transformation. Their numbers are constantly growing. Figure 1.1 
presents a typical report, constantly updated at [9]. On January 2, 2003, there were 20,809,353 
organic and inorganic substances, 24,206,720 sequences, and 6,559,626 single- and multi-step 
reactions. 

Each substance or transformation can be described in numerous publications and some of 
them are subjects of thousands of papers. The number of recorded chemical species, however, is 
but a small subspace of the entire chemical configuration space. 

For comparison, all human thoughts, in the form of sequences of symbols, form a yet 
uncounted but, probably, equally staggering sequence space. Some mental configurations, for 
example, the Ising model and Shakespeare’s Hamlet, are subjects of countless publications in 
sciences and humanities respectively. 




Chemical information is as incompressible and devoid of generality as a telephone 
directory. In addition, chemistry has a limitless craving for intimate details. Chemistry consists 
mostly of painstakingly described facts observed at certain conditions. The facts belong to two 
types: unitary structures and binary transformations. A novice in chemistry is often overwhelmed 
by the apparent absence of any rigorous theory of a mathematical nature, but this can be stated 
about works of Shakespeare, too. 


CAS is the leading provider of organic, inorganic, 
and biosequence substance information. 

CAS Registry Number and Substance Count 

Date:    Mon Mar 4 10:08:41 EST 2002 
Count:   19,417,310 organic and inorganic substances 
         17,601,341 sequences 
Total:   37,018,651 chemical substance registrations 
CAS RN:  397841-96-2 is the most recent CAS Registry Number 



Figure 1.1 Example of the CAS Registry count 


Chemistry provides a good illustration of the distinction between configuration and image 
in PT, see [2], p. 52 and [3], p. 91. A chemical configuration is an abstraction. Its images are 
conformations with the similarity transformation: rotation around a single bond, Figure 1.2 A, 
wherever this rotation is possible. Chemists usually ignore conformations, unless they are 
relatively stable, and in biochemistry they often are. 

Conformations, in turn, are abstractions, too. The images of a conformation have the 
similarity transformation: stretching and/or bending (changing the distance and angle between 
bonds), Figures 1.2 C and B. The deformations are observable in molecular spectra. Mostly, but 
not exclusively, protein folding is a process of going through a sequence of conformations, 
complicated by various interactions. 














At the level of configuration, chemistry can be as much reduced to statistical averages as Hamlet 
to the statistics of its word usage. This circumstance is often another source of distress for a 
student of chemistry with an analytical frame of mind, but a source of delight for dedicated 
chemists with a more romantic attitude to the world where it is individuality that matters. 



Figure 1.2 Bond deformations: A: Rotation, B: bending, C: stretching. 


The type of combinatorial complexity that chemistry displays deserves a closer look. 

There are two different ways toward a combinatorial explosion. The obvious way is 
expanding the base set. Another way was pointed to by Bourbaki in Theory of Sets [10, p. 259] 
and called the scale of sets. 

In the following excerpt, 𝒫, substituted for the original Gothic letter P, denotes a set of 
subsets. 


1. Given, for example, three distinct sets E, F, G, we may form other sets 
from them by taking their sets of subsets, or by forming the product of one of 
them by itself, or again by forming the product of two of them taken in a certain 
order. In this way we obtain twelve new sets. If we add these to the three original 
sets E, F, G, we may repeat the same operations on these fifteen sets, omitting 
those which give us sets already obtained; and so on. In general, any one of the 
sets obtained by this procedure (according to an explicit scheme) is said to belong 
to the scale of sets on E, F, G as base. 

For example, let M, N, P be three sets of this scale, and let R{x, y, z} be 
a relation between generic elements x, y, z of M, N, P, respectively. Then R 
defines a subset of M × N × P, hence (via a canonical correspondence) a subset of 
(M × N) × P, i.e., an element of 𝒫((M × N) × P). 

Thus to give a relation between elements of several sets in the same scale 
is the same as to give an element of another set in the scale. Likewise, to give a 
mapping of M into N, for example, amounts (by considering the graph of this 
mapping) to giving a subset of M × N, i.e., an element of 𝒫(M × N), which is 
again a set in the scale. Finally, to give two elements (for example) of M amounts 
to giving a single element in the product set M × M. 

Thus being given a certain number of elements of sets in a scale, relations 
between generic elements of these sets, and mappings of subsets of certain of 
these sets into others, all comes down in the final analysis to being given a single 
element of one of the sets in the scale ([10], p. 259). 
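
The counting in the excerpt can be checked with a short symbolic sketch in Python. It is only an illustration under stated assumptions: sets are represented by names rather than by their elements, the tags "P" and "x" are hypothetical labels for "set of subsets" and "product", and no canonical identifications (such as that of (M × N) × P with M × (N × P)) are applied.

```python
from itertools import product

def expand(known):
    """One step of the construction: sets of subsets and ordered products."""
    candidates = {("P", x) for x in known}                             # sets of subsets
    candidates |= {("x", a, b) for a, b in product(known, repeat=2)}   # ordered products
    return candidates - known  # omit sets already obtained

base = frozenset({"E", "F", "G"})
step1 = expand(base)       # new sets after the first step
level1 = base | step1      # the "fifteen sets" of the excerpt
step2 = expand(level1)     # new sets after repeating the operations

print(len(step1), len(level1), len(step2))
```

With three base sets, the first step gives 3 sets of subsets, 3 products of a set by itself, and 6 ordered products of two distinct sets, i.e., the twelve new sets of the excerpt; the second step already produces a few hundred, which shows how quickly the scale grows from a small base.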


Here the combinatorial base set is constantly expanding because each new combination 
enters the set as a new element. Therefore, the number of initial elements can be quite small, as 
the following sequence, built of two symbols, but potentially infinite, illustrates: 


{(AB) (AB) [(AB) (AB)](BA)}{(AB) (AB) [(AB) (BA)](BA)}... 


The treatise by Bourbaki starts with describing mathematics as combinations (called 
assemblies) of letters and signs and fits prophetically well the structure of both chemical space 
and the space of ideas. The letters correspond to generators and the signs, including brackets, to bonds. 
Thus, not only mathematical formulas, but also sentences like “every finite division ring is a field” 
[10, p. 1] are listed as assemblies. 





From the point of view of a chemist, the static chemical structures are no less assemblies 
than formulas and phrases because linearity of connector is not among Bourbaki’s conditions and 
any connector graph is a sign. 


It is a chemist’s impression that both chemistry and PT go further by introducing 
measures on “assemblies” space, which leads to a thermodynamics of a kind. 

The difference between the two ways of complexification, which could be referred to as 
conservative and progressive, is that in the conservative combinatorial space the elements do not 
have copies, while in the progressive one the elements have multiple entries into the 
combinations, which is the case with chemistry and ideas. New clusters of ideas get a new name 
or a new meaning of the old name and thus expand the base set recorded in dictionaries. This 
process is examined in detail in Patterns of Thought ([1], 3.3), where generalization creates 
macrogenerators and encapsulation adds a new idea to the envelope of the mind: 


ENCAPSULATION: idea → env(MIND) 


Note that specialization can create new generators, too, when a new species of a 
taxonomic unit is discovered. 

The process of generation of new ideas leads to progressive combinatorial explosion, 
unless there is a counteracting factor, which will be considered later. Without dissipating or 
depleting the envelope, combinatorial explosion might take over and paralyze the mind, with 
zero probability of retrieving any idea into content, which happens with Web sites during a 
massive “denial of service” attack, sometimes, spontaneously. While conservative combinatorial 
space is fully defined by the base set and the combinatorial operation, the progressive space must 
be generated in the process of evolution consisting of a series of choices. This is just a way to 
say that a progressive space has a history. 





Whether we consider chemistry or AI, knowledge at any given moment has nothing to do 
with the history of the area. Nevertheless, the work of the collective mind has a historical 
dimension reflected in the CAS Registry, citation indexes, and in plain references to previous works 
that allow for reconstruction of history. In social structures, as well as in any individual human 
mind, however, history can have an enormous bearing on both popular mindset and individual 
character with its prejudices and motivations, Freud or no Freud. 

The chemical configuration space is oppressively cumulative: each new structure remains 
on record. Against all odds, however, modern chemistry can successfully navigate its own real, 
not virtual, complexity with a modest number of tools of high power. This is why chemical 
experience might be relevant for handling other natural systems of towering complexity. 


Whether it is fortunate or not that we still do not have the same power of simplification in LMS, 
is a matter of personal philosophy. 


Since the terms conservative and progressive are overloaded with connotations, to 
distinguish between the two kinds of combinatorial complexity, the terms Ar-complexity (Aristotelian) 
and He-complexity (Heraclitean) are suggested. The first one is defined by the fixed base set or, 
in terms of PT, fixed generator space, while the second one builds up as a mapping onto the scale 
of sets. In terms of PT, it would sound as expanding generator space. 

It has always been among major scientific goals to work on well-defined structures, 
which is, probably, the main reason for the rift between sciences and humanities. He-complexity 
could be a way to mathematically accommodate history of individual life, biological species, 
ideas, and nations by reflecting the distinction between old and new. 


According to Bourbaki’s terminology, any scale of sets built according to explicit instructions, 
belongs to the scale of sets on a particular basis. Taking some liberties, we will use term scale of 
sets for this kind of a partial scale, too. 





Chemistry serves as an example of He-complexity because almost every chemical 
structure can be combined with another and given a new name and a new CAS number, which 
would allow for tracing the evolution of chemical space. A similar kind of complexity can be 
found in the development of algorithms by combining subroutines into new standard blocks. Any 
block is always younger than its constituents and thus the evolutionary axis is established 
without the physically explicit variable of time. 
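
The idea that a block is always younger than its constituents can be illustrated with a small Python sketch. It is a toy model only: the two base generators, the random pairing rule, and the function name are assumptions made for the illustration, not the simulations described later in the paper.

```python
import random

def grow_he_system(base, steps, seed=0):
    """Grow a generator space in which every new combination becomes a generator.

    Returns a list of (generator, constituents) pairs in order of appearance;
    the list index serves as the generator's age rank.
    """
    rng = random.Random(seed)
    history = [(g, ()) for g in base]          # base generators have no constituents
    for _ in range(steps):
        i = rng.randrange(len(history))
        j = rng.randrange(len(history))
        a, b = history[i][0], history[j][0]
        history.append((f"({a}{b})", (i, j)))  # the new block joins the generator space
    return history

history = grow_he_system(["A", "B"], steps=6)
for rank, (generator, parts) in enumerate(history):
    print(rank, generator, parts)
```

Every block refers only to entries with a smaller rank, so the order of appearance defines an evolutionary axis without any explicit time variable.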


Remarkably, Pattern Theory in its current state does not need any modification to accommodate 
both Ar- and He-systems. It is inherently fit to describe patterns of history, to which physical 
sciences have been traditionally blind. 


A different example is the scientific citation space, where a point refers down to previous 
points and is referred to in subsequent citations, which is similar to the levels in Ulf 
Grenander [1]. Notably, the citation space is not a hierarchy in the sense in which any taxonomic space is: 
all citations are equal and refer to real works. The citation network is partially ordered and it 
establishes a direction of natural historical time. 
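
The partial order of a citation network can be made concrete with a topological sort. This is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later); the paper names and citation links are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical citation data: each paper maps to the set of papers it cites.
CITES = {
    "paper_A": set(),
    "paper_B": {"paper_A"},
    "paper_C": {"paper_A"},
    "paper_D": {"paper_B", "paper_C"},
}

# static_order() lists every cited paper before the papers that cite it.
order = list(TopologicalSorter(CITES).static_order())
position = {paper: k for k, paper in enumerate(order)}
print(order)
```

Papers B and C are not ordered relative to each other, which is what "partially ordered" means here: the network fixes a direction of time but not a unique sequence.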


One may wonder whether the gigantic molecules of DNA are also organized as the scale of sets, 
coding the phylogenesis of the species as well as the basics of ontogenesis. 


We return to this idea later because it seems to be of utmost importance for the structure 
of working mind. 




2. Molecular patterns 


It may look on the surface that chemistry lacks an explicit concept of pattern. Nevertheless, the 
concept of pattern is deeply ingrained into chemical thinking and is one of the most efficient 
ways to manage the unbearable burden of chemical complexity. 

The initial ideas of chemistry about structure had a lot of pattern spirit. According to 
the theory of types of Auguste Laurent (1807-1853), molecules were supposed to fall into certain 
structural patterns, called types, for example, those of water and ammonia. 

The type of water is A—O—B , where O is oxygen and A and B can be any 
combinations of atoms, among them: 


H—O—H        CH₃—O—H        C₂H₅—O—C₂H₅        etc. 
water        methanol       ethyl ether 



Here the similarity transformation is substitution of other atomic subconfigurations for H 
in water. In this sense, chemical patterns are simply taxonomic units of classification of 
molecules and their transformations. Certain chemical subconfigurations (called functional 
groups) usually display a common behavior and are only partially influenced by their structural 
environment. 


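The type of ammonia, in which nitrogen N is bonded to three groups A, B, and C, comprises amines: 

[Structural formulas of ammonia-type compounds: nitrogen bonded to substituents such as H, CH₃, and C₂H₅] 

etc. 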


The theory of types gave a strong impetus to chemical theory and already by 1865 the 
modern concept of chemical connector, in PT terms, was ready to hatch. 

Up to the present, basic organic chemistry has been taught to students by types of 
structure: alcohols, amines, sugars, proteins, etc. The starting chapters are typically entitled: 

Alkanes and Cycloalkanes 

Alkenes 

Alkynes 

Alcohols, Diols, and Ethers 

etc., i.e., by patterns of chemical structure, without too much overlapping. 


Looking at a big and complex chemical formula, a novice in chemistry is immediately 
lost, while a more experienced chemist perceives it in terms of known types of 
subconfigurations. Later, some connector graphs were added to the intuitively defined types, 
such as the alternating single and double bonds 

...A—A=A—A=B—A=A... 

as well as more complicated ones. 

For example, trying to understand formulas in Figure 2.1, a chemist first suspects the 
pattern of steroids in the connector and then looks at the neighborhoods of atoms. 



Figure 2.1 Steroids and their connector 

Similarly, the connector of porphin defines a pattern to which such important substances 
as heme and chlorophyll belong, Figure 2.2. 


[Panels: heme; porphin connector] 


Figure 2.2 Heme and its porphin connector 

The chemist perceives an unfamiliar structure in terms of standard superatomic blocks 
and their connectors. Similarly, the formula of an unknown substance is reconstructed block by 
block, according to their patterns of behavior and topology. Two blocks are, possibly, neighbors 
if they belong to a larger block and can be found in the products of its decomposition. 





Most practically important molecules belong to several patterns at the same time. Taking 
into account the interaction between different groups within the same molecule, the concept of 
pattern in chemistry, clearly visible from a distance, is significantly blurred at a close range. No 
doubt, however, that a student of chemistry learns to perceive the immense complexity of the 
molecular world in terms of patterns that look like mathematical formulas where symbols stand 
for types of structure and transformation. Not accidentally, various shortcuts for large blocks are 
popular. Thus, R in R—O—H can stand for 61 atoms (C₂₀H₄₁) with a very intricate connector. 

In the chemical formula of aspirin, Figure 2.3, we find at least three stable 
subconfigurations familiar from countless other structures: the ortho-substituted benzene nucleus (ortho 
means side-by-side), carboxyl, and the acetoxy group. Aspirin, therefore, displays at least four 
patterns: 

1. aromatic compounds, i.e., derivatives of benzene; 

2. carboxylic acids; 

3. esters; 

4. ortho-substituted aromatic compounds, i.e., derivatives of benzene with two 
substituents side by side. 




Figure 2.3 Overlapping patterns of aspirin 







Any molecular structure can be converted, by standardized international rules, into a 
string of letters from which the structure can be drawn. Thus, aspirin is 2-(acetyloxy)-benzoic 
acid, which would look different in Russian, but have the same meaning. This correspondence 
resembles, in inverse order, the relation between the text of Hamlet and its stage production. 

Information contained in a chemical formula can be coded by a matrix of bonding, as the 
example of formaldehyde in Figure 2.4 shows. Aspirin, C₉H₈O₄, requires a 21×21 matrix. 
Matrices for all large molecules are sparse. 



Figure 2.4 Two representations of the structure of formaldehyde

In dynamic representations of configurations, the connectivity matrix can contain 
probabilities or affinities instead of just Boolean incidence. Moreover, a multidimensional array 
can comprise all essential aspects of the configuration, including its history, if the configuration 
is a scale of sets. Thus, the diagonal can store the data for generators and other elements can 
specify types of bonds: single, double, etc. This method is actually used in computer coding of 
chemicals. 
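A hedged sketch of such an enriched matrix follows. It assumes one particular layout (atomic numbers on the diagonal, bond orders elsewhere) and is not the format of any specific chemical coding system; `structure_matrix` and `valences` are hypothetical names.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8}


def structure_matrix(atoms, bonds):
    """Diagonal: atomic number of each generator. Off-diagonal: bond order (0 = no bond)."""
    n = len(atoms)
    matrix = [[0] * n for _ in range(n)]
    for i, symbol in enumerate(atoms):
        matrix[i][i] = ATOMIC_NUMBER[symbol]
    for i, j, order in bonds:
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


def valences(matrix):
    """Sum of bond orders at each atom, with the diagonal entry excluded."""
    return [sum(row) - row[i] for i, row in enumerate(matrix)]


# Formaldehyde: C=O double bond and two C-H single bonds (atom order is an assumption).
formaldehyde = structure_matrix(["C", "O", "H", "H"], [(0, 1, 2), (0, 2, 1), (0, 3, 1)])
print(valences(formaldehyde))
```

Keeping the generators on the diagonal means a single array carries both the atoms and their bonds, so a simple row sum is enough to check the usual valences.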

Chemistry as a testing ground for pattern ideas is interesting also from another angle: 
perception and understanding of images. 

The chemist deals with very complex and large chemical formulas that must be
understood. The process of understanding of a chemical formula consists of compiling (partly, 
subconsciously) a list of all its patterns, as in Figure 2.3. The verbal description and writing the 
exact chemical name of the substance is the final proof of understanding because it can be
shared with another chemist, in accordance with the old saying that the best way to understand
something is to explain it to somebody else. 

Similarly, the understanding of a picture consists of listing the objects in the picture in
such a way that it can be shared with another person who cannot see the picture itself. For 
example: “I see two birch trees on a grassy foreground against the background of a conifer 
forest.” This renders not the picture but its pattern. 

Human understanding, in a way, is a social phenomenon. As soon as the components of 
an image and their spatial relation are identified in such a form that can be shared, the image, 
whether a picture or a chemical formula, is understood. Unlike the natural objects that must be
subjected to pattern analysis before they can be processed further, for example, in computer 
vision and recognition, chemical formulas are “turnkey ready” because they are already naked 
patterns. 




3. Regularity and probability in chemistry 


Chemistry is a very liberal science in the sense that it has few, if any, strict rules without 
exceptions. It constantly discovers something that was unthinkable before and is always on the 
prowl for monsters and chimeras. 

To give an example, it had been an axiom that noble gases were unable to form
compounds, until in the 1960s the axiom was shattered. This puts a teacher of chemistry on
shaky ground because whatever the students say may be true.


When the authors of this book were undergraduates we were taught that the noble gases did not form 
chemical compounds. Then several noble gas compounds were discovered in the early 1960s [11]. 


This is why chemistry does not have much to say about regularity. It is part of the 
chemical credo, rarely expressed publicly, that anything we can imagine is possible. The hidden 
foundation of this belief is, probably, the isomorphism of thoughts and things in chemistry. 




Chemical chimeras can be inspired by some very distant reality. For example, the 
chemical structures in Figure 3.1—two interlocked rings and a ring on a dumbbell—were 
synthesized just because of the challenge of the shape. They are, so to say, materialized ideas. By 
doing that, a chemist becomes an architect. 



Figure 3.1 Topology of catenanes (A) and rotaxanes (B)

If atoms can combine and form various configurations in our imagination, they most 
probably can do it in a flask. This kind of philosophical idealism would not survive for long if it 
were not confirmed by experiment. 

“Yes, everything thinkable is possible in fact, but how stable is it?” The chemist asks this
kind of question in the situation where the mathematician would ask, “Yes, but how probable is 
it?” Here lies an apparent difference between PT and chemistry. Where the mathematician 
thinks in terms of probability, the chemist (and, probably, architect) thinks in terms of stability. 
There must be some serious reasons for that, and the main reason is that the configuration space 
in chemistry is a scale of sets, while a mathematician would probably prefer to deal with 
individual sets, however fuzzy. It does not mean that the chemist has an aversion to probabilities, 
but they are expressed as concentrations, i.e., probabilities to find a molecule of certain kind in a 
unit volume. The law of large numbers works over the entire “real world” chemistry, with the 
exception of special cases when a single molecule is the object of investigation, as in protein 
folding. 

An important set of rules of regularity comes from quantum chemistry. Quantum-chemical
regularity means preserving a certain arrangement of the external (valence) electrons
between atoms. A single line portrays the covalent chemical bond, typical for complex organic
molecules. A double or triple line between two atoms (example: H—C≡C—H) does not mean
two or three identical bonds: the second and third bonds are very much different from the first,
and the chemist always keeps that in mind.

Without going into quantum-theoretical specifics, chemical stability requires the
configuration to have certain regular neighborhoods of atoms. The regular neighborhoods have a
certain number of outbonds at the central atom, each of them portrayed by a single bond
corresponding to a pair of shared valence electrons. The sum of valencies is eight (octet rule) for
some of the most widespread atoms, two for hydrogen, and usually up to twelve for other
atoms. It seems like a lack of regularity, but each particular case is explainable by quantum
chemistry.

Figure 3.2 illustrates the use of the octet rule.



Figure 3.2 An illustration of the octet rule


Atom A forms a stable compound AH₃ because the neighborhoods of H (hydrogen) have
regular pairs of shared electrons, and the neighborhood of A has the regular eight shared
electrons. The different fill of the little circles symbolizes different contributors to the shared
pair. The structure is regular and, therefore, expected to be stable, which is always better to
check experimentally.

However, even the octet rule is riddled with exceptions. Thus, there are simple molecules
where the number of electrons at an atom is seven, as it is at atom A in Figure 3.3, where B
means just another atom, not boron. Nitrogen dioxide, a brownish gas present in engine
exhaust, belongs to this type. The transition between the two forms, monomer and dimer, the
latter being quite regular, happens to fall in the interval of temperatures commensurate with
human existence. At room temperature, the monomer prevails, while at the temperature of
melting ice, the dimer dominates, so that the brown color of the monomer disappears. Regularity
in chemistry is relative: it is more or less defined for a certain temperature interval.



Figure 3.3 Example of a formally irregular but stable structure (B=A=B): monomer (hot) and dimer (cold)


For most other chemical bonds, the interval of reversible dissociation and association is
much higher, somewhere near red heat. At any temperature, the two forms are in equilibrium:

\[ \mathrm{BAB} + \mathrm{BAB} \rightleftharpoons (\mathrm{BAB})(\mathrm{BAB}) \tag{3.1} \]


Or, in notation more typical for chemistry,

\[ 2\,\mathrm{AB_2} \rightleftharpoons \mathrm{A_2B_4} \tag{3.2} \]


The position of the equilibrium between these two forms can be exactly calculated
because equation (3.2) describes a canonical Gibbs ensemble with a complete list of entries,
leaving nothing to imagination. In chemical notation, concentration, i.e., the probability to find
a molecule of a certain type in a volume of the mixture, is symbolized by square brackets and
defined by:

\[ K = \frac{[\mathrm{A_2B_4}]}{[\mathrm{AB_2}]^2}; \qquad [\mathrm{A_2B_4}] + [\mathrm{AB_2}] = \mathrm{const} \tag{3.3} \]

Since the total concentration of molecular species, [A₂B₄] + [AB₂], is known, the
concentrations of each species can be calculated.
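As a hedged illustration of this calculation, the sketch below assumes that the equilibrium constant has the form K = [A₂B₄]/[AB₂]² and that the sum of the two concentrations is fixed. The function name and the numerical inputs are hypothetical.

```python
import math


def equilibrium_concentrations(k_eq, total):
    """Solve K = [A2B4] / [AB2]**2 together with [A2B4] + [AB2] = total.

    Returns the pair ([AB2], [A2B4]).
    """
    if k_eq <= 0 or total <= 0:
        raise ValueError("k_eq and total must be positive")
    # Substituting [A2B4] = K * x**2 gives K*x**2 + x - total = 0 for x = [AB2];
    # only the positive root of this quadratic is physically meaningful.
    monomer = (-1.0 + math.sqrt(1.0 + 4.0 * k_eq * total)) / (2.0 * k_eq)
    dimer = total - monomer
    return monomer, dimer


monomer, dimer = equilibrium_concentrations(k_eq=10.0, total=1.0)
print("monomer:", monomer, "dimer:", dimer)
```

Because the list of species is complete, the two conditions leave exactly one admissible solution. This is the sense in which nothing is left to imagination.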

Naturally, the temperature dependence of the equilibrium constant K is expressed by

\[ -RT \ln K = G_{\mathrm{A_2B_4}} - 2\,G_{\mathrm{AB_2}}, \tag{3.4} \]

where T is temperature, R is the gas constant, and G stands for (no surprise!) Gibbs energy,
which in chemistry is misleadingly called free energy, the term that will be avoided here. We
shall return to it later.
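A small numerical sketch can show how the equilibrium shifts with temperature. It uses the standard relation between the Gibbs energy change and the equilibrium constant, ΔG = −RT ln K, and splits ΔG into ΔH − TΔS. The enthalpy and entropy values are illustrative round numbers chosen for a dimerization that releases heat, an assumption of this sketch rather than data from the text.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def equilibrium_constant(delta_h, delta_s, temperature):
    """K for monomer -> dimer from delta_G = delta_H - T*delta_S = -R*T*ln(K)."""
    delta_g = delta_h - temperature * delta_s
    return math.exp(-delta_g / (R * temperature))


# Illustrative inputs only (assumed): dimerization releases heat and loses entropy.
DELTA_H = -57.0e3  # J/mol
DELTA_S = -176.0   # J/(mol*K)

for celsius in (0.0, 25.0, 100.0):
    kelvin = celsius + 273.15
    print(celsius, "C  K =", equilibrium_constant(DELTA_H, DELTA_S, kelvin))
```

With these inputs K grows as the temperature falls, so the dimer gains ground in the cold, in line with the fading of the brown color near the melting point of ice.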

Equation (3.1) is not the only possible transformation. For example, we can imagine a
decomposition of two BAB into perfectly stable two BB and one AA. And in fact, this is
what happens in the catalytic converter of an automobile where the poisonous nitrogen oxide
NO₂ is decomposed into harmless nitrogen N₂ and oxygen O₂. Moreover, the position of
chemical equilibrium favors this decomposition! However, it practically never happens if the
oxides are left at normal conditions.


If unexplained, the chemical liberalism could take a good bite off our faith in chemistry. Of
course, a chemist can rule out the spontaneous decomposition simply because it has never been
observed at normal temperatures (it will definitely happen at high temperatures). But there must
be some scientific reason for that if chemistry should not be taken for a kind of soothsaying.

Certain imaginary things happen in fact, since the times of Jules Verne, but others are
impossible or need sophisticated tricks to make them real. It seems important to find the 
constraints of realism to be applied to such areas of thought as mathematical systems, statistical 
mechanics, and Gibbs ensembles, too beautiful for the challenges of real life. 


The reasons for the irrational stability of NO₂ and the persuasive power of catalytic
converters will be considered in Part 8.





Theoretically, any combination of atoms is possible, but most of them have such a high 
energy that they could not exist at normal conditions. The idea of regularity in chemistry, 
therefore, is tied to conditions. What is regular at one set of conditions becomes irregular at 
another. Irregularity appears only if we define regularity. 


Thus, if we trust the Bible or the Koran to define a set of regular ideas, any other idea becomes
irregular.

Chemical regularity, therefore, forms a continuum. Nevertheless, there is a common
understanding of what is practically regular. Regular is what is stable. Regular molecules are,
approximately, those that can be kept in a jar, at least overnight. This probably can be said
also about thoughts: a regular idea is the one that can survive in the mind, say, five minutes, all
the more, overnight. Probably, scores of vague and irregular ideas fly through our mind every
minute, some of them even beyond verbal expression, and most subconscious. Certainly, what is
formulated in words has at least a look of regularity.

The chemists operate with energies instead of probabilities because energy is measurable 
while probability is not. Energy is linked in a chemist’s mind with stability: low energy means
stability and high energy means instability. Of course, as with everything in chemistry, the 
borderlines are diffuse. 

The chemists do not use probability as a measure of stability, and for a good reason: the 
complete list of combinations of atoms has a complexity of the scale of sets. Chemical systems 
do not have a list of all configurations in the configuration space. There is no set of exhaustive 
and mutually exclusive alternatives or events. This is why chemists rarely use absolute energies, 
either, focusing instead on differences. 

It looks like the chemists are intuitively Bayesian in their approach. The typical question 
is: we know for sure that the system is in the state A. What is the probability of the state B at the 
next observation, on the condition that it really happens and nothing else does? To ask such 
questions, we need to imagine B first. Suppose, there is A and derived from it B. How probable 
are both if there is nothing else? On such conditions, we arrive at the classical statistical 
mechanics that ignores what cannot be imagined. 





For a chemist, the question what is going to happen the next moment is rhetorical,
especially when dealing with evolving systems. The chemist must know the options beforehand.
The “Bayesian” form of question, however, allows the chemist to operate not with absolute
energies but with differences, i.e., ratios of probabilities, deliberately idealizing the system.
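The link between an energy difference and a ratio of probabilities can be sketched with the Boltzmann factor for an idealized system that has only two states, A and B. The two-state restriction is exactly the deliberate idealization described here; the function names and the sample energy difference are assumptions of this illustration.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def probability_ratio(delta_energy, temperature):
    """P(B)/P(A) when the molar energy of B exceeds that of A by delta_energy (J/mol)."""
    return math.exp(-delta_energy / (R * temperature))


def two_state_probabilities(delta_energy, temperature):
    """Probabilities of A and B if nothing else can happen (they sum to one)."""
    ratio = probability_ratio(delta_energy, temperature)
    p_a = 1.0 / (1.0 + ratio)
    return p_a, 1.0 - p_a


print(two_state_probabilities(delta_energy=5000.0, temperature=300.0))
print(two_state_probabilities(delta_energy=5000.0, temperature=1000.0))
```

Only the difference between the two energies enters the calculation, which is why no absolute energy and no complete list of configurations is needed.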

If regularity in chemistry is closely tied to stability, i.e., to energy, it is no different from 
the situation in any Gibbs ensemble, but over time, not over space. If we consider a disjoint 
chemical configuration in a single copy, the expected result is also a statistical ensemble over a 
long observation time, provided the system is isolated. Naturally, if the temperature goes up, the 
regularity will relax. 

Chemical synthesis consists of mixing stable compounds, letting the reaction take its 
course under certain conditions, and, finally, isolating stable products. 


To compare, thinking consists of acquiring reliable data, letting the process in the mind take its 
course, and, finally, formulating the results. In creative thinking, the timing of the process is 
hardly predictable, which is oblique evidence that thinking deals with a single mental
configuration. 


This implies that a chemical transformation runs through a series of intermediate 
irregular configurations that do not last for too long. Otherwise, the result will always be the 
Gibbs equilibrium. It should be noted that some reactions come to equilibrium pretty fast, but so 
do competing undesirable reactions. 

Unlike mathematics, chemistry not only refuses to deal with infinity but is also hardly 
ever interested in chemical reactions that take a long time, except in winemaking. The chemical 
system is dramatically different from a Gibbs ensemble, which is devoid of the time axis. 
Chemistry is as much controlled by kinetics as by thermodynamics.

A realistic model of the mind, therefore, must account not only for an expanding
configuration space, or, in terms of PT, an expanding envelope, but also for the kinetics that
distinguishes between theoretically and practically possible events.




4. Thoughts and molecules 

Figure 4.1 looks suggestive, but can thoughts have a “chemistry” in more than a metaphorical
sense? The formal treatment of assemblies in set theory and configurations in PT brings the
chemical formulas and ideas into the class of formal constructs. If so, do thoughts about
molecules behave like molecules? This question invokes the spirit of Gödel: are the Gödel
numbers of statements about numbers in fact statements? Coming back to Bourbaki’s theory
of structures, in which a mathematical theory contains rules that assign to assemblies the status
of terms, signs, or theorems, while the rules themselves do not belong to the formal mathematics,
we find that the above question can be answered in the affirmative, at least as a hypothesis. To add
more weight to the answer, we first need to know how the molecules really behave.



Figure 4.1. A thought and a molecule 





Although texts look like polymers of words, there is a notable difference between texts 
and molecules. The number of atoms is limited by the Periodic Table but the variety of ways 
they are connected is enormous. The number of linguistic terminal generators, i.e., words, is 
extremely large, but the variety of their connectors is practically exhausted by a grammar that is 
much thinner than a telephone book of a small city. 

To name its countless objects of study, chemistry uses its own language with a very 
special relation between the sign and the signified. Regarded as configurations, the sign is 
isomorphic to the signified. In the terminology of linguistics, it is a pictograph, i.e., the most 
primitive method to code objects by their pictures and to paint a beer mug as the sign for a 
tavern. 

All ancient symbols for small numbers were pictographs: the symbol contained as many
elements as the number it symbolized, and it is still true about small numbers in Arabic and
Chinese, and, of course, Roman numerals: I, II, III, V, and X. The first three Chinese
numerals are the same as the Roman ones, only horizontal.

In most languages, a distinction exists between letters and words. The words are 
combinations of letters, but not separate letters. Chinese characters, used also in the Japanese 
language, in some aspects, level out the field for letters and words. Various characters, when 
combined, acquire a new meaning and become a new character, see Figure 4.2. If so, where does 
this end? Characteristically, in the Chinese and Japanese languages the space between the words 
does not exist (there are some means of quasi-spacing in the Japanese language). There is no formal 
difference between characters and words. 


Figure 4.2 Examples of compound Chinese characters. Components great,
seeing light, and happiness are themselves compound (source:
http://zhongwen.com/).








The borderline between molecules and thoughts is blurred in the practical work of a 
chemist who manipulates atoms and molecules on paper, in mind, or with a computer. In the 
mind of a chemist, ideas about molecules behave like the molecules in the test tube. 

When Hamlet contemplates his actions in sequences of about 30 symbols of English 
alphabet and syntax, there is no direct conformity between the text and the action, and not much 
even between the text and the speech. Symbol S for entropy in physics looks more like a lithe 
snake than a state of flaccid disarray. The Gödel number is a far cry from the statement it enumerates.

The situation is different in chemistry where the chemical formula is still a clear
pictograph. If it is converted, according to some rules, into a line of text from which the original
structure can be reconstructed, it is no longer a pictograph. The chemists do not think in the
chemical names of substances, unless for a quick communication. They discuss pictographs of
molecules. Thus, the idea of water in the head of a chemist, not a layman, is isomorphic to the
actual molecule of water. Thinking about water, a chemist imagines its three constituting atoms,
bonds between them, and sometimes even the angular shape of the molecule (Λ or <). The
chemist is used to thinking in images of molecules and their transformations, letting them play
according to the rules of the chemical game. To compare, a musician may think about music and
remember it in finger and hand movements, in addition to sounds.

The chemist deals with ideas of atoms as if they were labels pasted on real atoms. In 
imagination, helped with a sheet of paper or a computer simulation, the chemist lets the atoms 
and groups dance, bounce, and recombine. The imaginary balls and connecting sticks collide and 
adjust to each other in the mind of the chemist before the stage of the actual experiment, as well 
as post factum. Sometimes the chemist does it without any guiding idea, but more often, draws 
from known patterns. Then, after testing the idea in an experiment, the virtual play is resumed, 
and so on, until a satisfactory result is achieved. 

Similar strategies appear to dominate other areas of creativity: writing poetry, doing 
science, and inventing. Thus, a poet starts with some word, line, or just an idea and grows this 
seed into a polymer of words, marking up its stressed areas, awkward interference of words, and 
mutual repelling of lines, changing the sequence until the stress is minimized (or enforced), and 
doing that in imagination or on paper. 





The great Russian poet Anna Akhmatova: “If only you knew from what dirt the poems 
shamelessly grow.” 


Search and fitting for the right rhyme could be as painstaking as shopping by a movie 
star. The entire creative process, whether in science or, at least, in pre-postmodern humanities, 
falls into the same category of processes as protein folding, during which a line of a polymer 
through twisting, bending, and writhing, takes the most stress-free shape. The poet, however, 
invents his or her own thermodynamics. 

Moreover, a large stretch of history can belong to this kind of processes, too. For 
example, the history of France, since the French Revolution up to our days, manifests a 
remarkable series of twists, turns, and contortions, through which the nation has been 
“stochastically relaxing” the contrast between the personal authority and the will of the people, 
see Part 5, Stability in LMS, Figures 5.3 and 5.4. 

In the pattern model of the mind following Ulf Grenander [1], generators of the mind
combine and recombine according to their affinity, which is a function of the previous state, 
random component, and stable constraints of the system, such as partitions of the generator 
space. Designing an artificial mind, one can take a different direction from this point, but the 
pattern-theoretical platform, covering a whole range of structures in the sense of Bourbaki with 
imposed constraints, will remain the same. 

Chemical experience, in a way, is materialized PT. 




5. Stability in LMS 


It may seem at this point that we have departed from any artificial or natural intelligence, but in 
fact this brings us to the historical beginnings of AI. 



Fig. 5.1 Interactions in the homeostat 


Concerning stability, it is appropriate to pay tribute to one of the almost forgotten ideas 
from the dawn of AI. In 1940, the homeostat of W. Ross Ashby (1903-1972) joined the class of 
systems consisting of particles interacting under constraints. 

Ashby’s Design for a Brain [12] and An Introduction to Cybernetics [13] are
fascinating reading. Ashby seems today, when the general ideas are buried under the sediment of
narrow and technical papers, well ahead of his time because his model was, in essence, a
structurally constrained “condensed matter,” close to that of Ising, where topological neighbors
influence each other. Figure 5.1 can be seen as an illustration of the Ising model. Not only that, but
the entire conceptual direction of Ashby presents, on the surface, an alternative to Turing’s
celebrated criterion of intelligence. Ashby’s refrain is adaptation. The criterion of intelligence,
therefore, is adaptation to the intelligent environment.


Whether it is accidental that the adaptive AI was born in the native land of Darwin and not where
Darwin has always been under siege—this is a good topic for a Turing test. An intelligent robot 
would start talking about the weather. 


Extrapolating Ashby’s reasoning, since the environment, populated by humans and 
animals, from a pack of wolves to a scientific symposium, itself can possess intelligence at 
various degrees, the intelligence under consideration is measured, like physical temperature, as 
the intelligence of the environment with which the individual can stay in adaptive equilibrium at 
least for a day. According to the Turing test, the agent is either recognized as intelligent or not, 
while the Ashby test (not explicitly expressed by him) could give a measure of intelligence. We 
use this measure in everyday life. If we can match a person in a conversation, we have at least 
the same intelligence. The adaptive criterion echoes the Peter principle: everybody reaches his or
her level of incompetence. Like energy in chemistry, IQ tests measure nothing but
the difference in intelligence of the test author and the object of testing.


Of course, an objection could follow that a buffalo has better chances of survival amidst a 
scientific symposium than facing a pack of wolves. The counterarguments are suggested as an 
exercise. 


While the constraints in a spin glass are topological, the constraints in Ashby’s primitive
system of four interacting “particles” come from the physical nature of the hardware,
namely, from the mechanical inertia of its parts. Being disturbed, the homeostat, after a series of
apparently random movements, comes to equilibrium. This system can be roughly compared
with four men (as the least reasonable gender) packed into a telephone booth: when one tries to
turn, he disturbs the other three, until they all find the least uncomfortable new position. This
kind of system has the topology of a complete graph, Figure 5.1, the same as for an ideal gas where
any particle can collide with any other. The connector, however, can be of any kind. Ashby insisted
that a large number of particles were needed for intelligence.

What is essential for us is the very type of behavior of Ashby’s adaptive system. The 
homeostat reacts to a disturbance by entering a short transitional mode of intense movement and 
subsequent coming to a new equilibrium. 
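This type of behavior can be imitated by a toy simulation. The sketch below is a discrete-time caricature, not a model of Ashby's electromechanical device: four variables are coupled each to all, each is damped toward zero, and whenever any variable leaves its allowed range the couplings are redrawn at random. All names, ranges, and parameter values are assumptions of this illustration.

```python
import random


def run_homeostat(n_units=4, limit=1.0, dt=0.05, steps=20000, seed=1):
    """Return (final_state, resets) for a toy homeostat with random re-selection of couplings."""
    rng = random.Random(seed)

    def random_couplings():
        return [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]

    couplings = random_couplings()
    state = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    resets = 0
    for _ in range(steps):
        # Every unit is pushed by all units (complete graph) and damped toward zero.
        drift = [
            sum(couplings[i][j] * state[j] for j in range(n_units)) - state[i]
            for i in range(n_units)
        ]
        state = [s + dt * d for s, d in zip(state, drift)]
        if max(abs(s) for s in state) > limit:
            # A variable left its range: redraw the couplings and clamp the state.
            couplings = random_couplings()
            state = [max(-limit, min(limit, s)) for s in state]
            resets += 1
    return state, resets


final_state, resets = run_homeostat()
print("resets:", resets)
print("final state:", final_state)
```

The count of re-selections gives a rough measure of how long the transitional mode lasted for a given seed. Couplings that let the variables decay are simply the ones that are never replaced.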

From the chemical point of view (which may be disputed by some chemists), reacting 
molecules in chemical environment behave as a homeostat. The action of a reagent knocks the 
molecule out of balance. A short-living transition state of the running reaction emerges that leads 
to a new equilibrium. 

The mind, notwithstanding the inner mechanisms, behaves in the same way: a 
disturbance, whether external one or an inner fluctuation (“it struck me,” “it just occurred to 
me”) triggers a state of intense work and, therefore, an increased energy consumption until a new 
state of equilibrium, however temporary, is reached. 

Evolution of society consists of periods of balance and stability interrupted by wars,
revolutions, and drastic reforms. During the periods of unrest, the society is seeking ways to
diminish the social stress, and if it is not achieved, the unrest can take a catastrophic form. A
history of a nation consists of periods of relative calm interspersed with times of turbulence. The
most spectacular example of the search for stability is the history of France, Figures 5.3 and 5.4.

In social psychology, various balance theories emphasize the importance of an internal 
cognitive balance. If two opposite systems of belief violate the balance, the individual tends to 
adjust them in order to diminish the cognitive dissonance (Leon Festinger, [14]). A smoker 
trying to quit passes back and forth between smoking and non-smoking phases through a painful 
state of cognitive dissonance. 

Ashby’s ideas were more refined than it may seem: they included the concept of step function,
which was close to the concept of mutation applied to behavior.





What is so unusual and pioneering about Ashby’s model is its thermodynamics, which he 
noted only casually. It is an open system, maintained by an external source of electric current, 
without which it would come to a halt once and forever. The current brings into motion the cells
and feeds the output. The system’s behavior, Figure 5.2, consists of alternation of stable and 
unstable (i.e., probable and improbable) states. 



Figure 5.2 Alternation of stable and unstable states in Ashby’s homeostat 

LMS systems are usually characterized as adaptive. It means that they tend to achieve a 
state with an optimized parameter, let us call it stability, from which a spontaneous transition to 
another state is unlikely. If brought into a state far from optimum, they begin a search for a
pathway toward stability. A war always ends with peace, but peace is punctuated with wars. The 
evolution of a stock market belongs to the same type of behavior. 







Figure 5.3 The energy profile of the French Revolution. Peaks mark instabilities. 


For a physical system, the stable state is that of equilibrium and the measure of stability is 
energy. The stable state in an open system is not equilibrium but a steady state with a minimal 
production of entropy. 



A problem with open systems is that thermodynamics alone is not able to predict what
kind of configuration will correspond to the stable state because thermodynamics has nothing to
say about structure. For example, the turbulent times of the 1960s changed the American attitude
toward war, but it is impossible to explain in which way by any general concept of homeostasis.
Generally, we can expect a roller coaster after September 11, 2001, but of what kind? We need to
enter the mind of the smoker as well as the mind of the nation and its enemies to compare
stabilities of different configurations.








6. Patterns of Transformation. 

Taxonomy of real-world chemical transformations is not different from the operations on 
configurations in Patterns of Thought. Chemical configurations are commonly disjoined because 
they correspond to mixtures of different substances. 

The chemist starts with a set of configurations of all participating components and has to 
decide what is going to happen with it, if anything at all. 

Patterns of chemical transformations usually consist of alternating steps of dissociation
and locking of bond couples. In order for the initial and the final products to be regular, some
bonds should be broken and others closed because loose bonds are evidence of irregularity and
such structures are unstable unless isolated or highly diluted.

Thus, the chemical reaction of substitution follows the pattern: 


A—B + C—D → A—C + B—D


The difference in font symbolizes here the degree of our attention. In the neighborhood of 
A, B is changed for C. If we switch attention to D, C is changed for B. 

Substitution is one of a few general patterns. The lower level includes various specific
reactions involving particular functional groups. For example, esterification:


A—C(=O)—OH + HO—B → A—C(=O)—O—B + H—O—H


This reaction is an example of a more general pattern of condensation. 


A—X + B—Y → A—B + X—Y


It looks no different than substitution, and in fact, it is not. The particularity of 
condensation is that X—Y is usually a very simple molecule, typically, water. This 
circumstance is crucial for biochemistry. 
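That substitution and condensation are one and the same operation on bond couples can be shown with a short sketch. Bonds are represented here as unordered pairs of generator labels; the function name `exchange` and the labels are hypothetical, chosen only for this illustration.

```python
def exchange(bonds, bond_1, bond_2):
    """Break bonds a-b and c-d, then close a-c and b-d between the freed ends."""
    (a, b), (c, d) = bond_1, bond_2
    old = {frozenset(bond_1), frozenset(bond_2)}
    if not old <= bonds:
        raise ValueError("both bonds must be present in the configuration")
    return (bonds - old) | {frozenset((a, c)), frozenset((b, d))}


# Substitution: A-B + C-D -> A-C + B-D
before = {frozenset(("A", "B")), frozenset(("C", "D"))}
print(exchange(before, ("A", "B"), ("C", "D")))

# Condensation: A-X + B-Y -> A-B + X-Y, where X-Y is a small molecule such as water
before = {frozenset(("A", "X")), frozenset(("B", "Y"))}
print(exchange(before, ("A", "X"), ("B", "Y")))
```

The number of bonds is the same before and after, so no loose bonds are left. Only the interpretation of the generators tells the two patterns apart.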

Although elementary steps are always simple, they can lead to a catastrophic change of a
global connector.



Figure 6.1 Cyclization of linear connector (panel labels: open chain, cycle; conformation L, conformation C)





Much more dramatic examples are known. Thus, the four cycles of steroids (Figure 2.1)
can be closed in vivo from a linear molecule along a very short pathway. 

To jump from a line to a circle, the configuration should take a certain position in the 
Euclidean background space, shown in the middle by broken lines, Figure 6.1. The ends of the 
chain must come into a close contact. The open chain and the chain in this pre-locked position 
are just different conformations, not different configurations. 
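The distinction between conformation and configuration can be sketched with a planar chain. In the code below the list of bonds is the same for the stretched and the bent chain; only the coordinates differ, and the closing bond becomes possible when the two ends lie one bond length apart. The planar geometry, the uniform turn angle, and all names are simplifying assumptions.

```python
import math


def chain_coordinates(n_atoms, turn_angle, bond_length=1.0):
    """Place n_atoms in the plane, turning by the same angle (radians) at every atom."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for _ in range(n_atoms - 1):
        x += bond_length * math.cos(heading)
        y += bond_length * math.sin(heading)
        points.append((x, y))
        heading += turn_angle
    return points


def can_close(points, bond_length=1.0, tolerance=0.05):
    """True if the chain ends are one bond length apart, i.e. the cycle can lock."""
    return abs(math.dist(points[0], points[-1]) - bond_length) <= tolerance


bonds = [(i, i + 1) for i in range(5)]           # same configuration in both cases
stretched = chain_coordinates(6, 0.0)            # open-chain conformation
bent = chain_coordinates(6, math.pi / 3)         # pre-locked conformation (hexagon)
print(can_close(stretched), can_close(bent))
if can_close(bent):
    bonds.append((5, 0))                         # the new bond changes the configuration
print(bonds)
```

Bending changes nothing in the bond list; the configuration changes only at the moment the sixth bond is added.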

In most chemical reactions, the events are strictly local in Euclidean background space. 
The local character of transformation in chemistry is a powerful means to manage complexity. 

Such transformations as cycle closure, with the exception of some special cases, require
additional energy to overcome the high entropy of distant ends of the forming cycle, which must 
be somehow compensated by forming additional bond couples. 




7. Chemical kinetics and transition state 

Chemical kinetics is concerned with the problem of the transformation speed. In pattern terms, it
amounts to asking how fast the probabilities change toward the equilibrium. In the virtual world,
kinetics can impose limitations on the computation time, for example, the speed of convergence
in stochastic relaxation, [2], p. 381.

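In the chemical language, the transformation speed means the rate of concentration
change.

Thus, for A → B (concentrations are in square brackets):

\[ \frac{d[\mathrm{B}]}{dt} = k\,[\mathrm{A}], \qquad [\mathrm{A}] + [\mathrm{B}] = \mathrm{const}, \]

where k is the rate constant, specific for each transformation.

For A + B → C:

\[ \frac{d[\mathrm{C}]}{dt} = k\,[\mathrm{A}][\mathrm{B}], \qquad [\mathrm{A}] + [\mathrm{B}] + [\mathrm{C}] = \mathrm{const} \]

For A + B + C → D:

\[ \frac{d[\mathrm{D}]}{dt} = k\,[\mathrm{A}][\mathrm{B}][\mathrm{C}], \qquad [\mathrm{A}] + [\mathrm{B}] + [\mathrm{C}] + [\mathrm{D}] = \mathrm{const} \]

For A + 2B → C:

\[ \frac{d[\mathrm{C}]}{dt} = k\,[\mathrm{A}][\mathrm{B}]^2, \qquad [\mathrm{A}] + [\mathrm{B}] + [\mathrm{C}] = \mathrm{const} \]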




In other words, the rate of transformation depends on the probability of finding all
participating components in each other’s neighborhoods of Euclidean space. The collision of
three or more particles is a rare event, and complex transformations go through a sequence of
simpler steps.
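The rate laws of this section can be tried out numerically. The following is only a minimal sketch: the function names, the Euler step size, and the numeric values are illustrative assumptions, not part of the original text. It evaluates a mass-action rate for arbitrary reaction orders and integrates the simplest case, A -> B, for which [A] + [B] stays constant.

```python
import math


def mass_action_rate(k, conc, orders):
    """Rate k * product([X]**order), e.g. k[A][B]**2 for A + 2B -> C."""
    rate = k
    for species, order in orders.items():
        rate *= conc[species] ** order
    return rate


def first_order(a0, k, t_end, dt=1e-3):
    """Euler integration of A -> B with d[B]/dt = k[A] (hypothetical step dt)."""
    a, b = a0, 0.0
    for _ in range(int(round(t_end / dt))):
        rate = mass_action_rate(k, {"A": a}, {"A": 1})
        a -= rate * dt  # what leaves A ...
        b += rate * dt  # ... appears as B, so [A] + [B] is conserved
    return a, b


if __name__ == "__main__":
    a, b = first_order(a0=1.0, k=2.0, t_end=1.0)
    print("numeric [A]:", a, "exact [A]:", math.exp(-2.0), "[A]+[B]:", a + b)
```

The numeric result can be compared with the exact solution [A] = [A]0 exp(-kt); the small discrepancy comes from the finite Euler step.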

Note that the total concentration of particles does not change, unless so intended. This is
one of three major limitations of the real world vs. the virtual one:

1. Conservation of matter. 


2. Conservation of energy. 

3. Euclidean topology. 

The meaning of the conservation of matter, in pattern terms, is that in configuration space
C, the probabilities $P(G_{ij})$ to find generator $G_i$ in configuration $C_j$ must be strictly
normalized:

$\sum_j P(G_{ij}) = 1$


The chemist does not deal with the unbearable He-complexity of the chemical 
configuration space, but cuts a small sector out of it, using various heuristics. No wonder, the 
chemist is often surprised and frustrated when a dark goo and other unanticipated impurities foul 
up the flask, which happens also with the best of minds. 

Euclidean topology of the real world defines how entropy can be estimated. The
transition configuration is possible only if all its generators occupy relative positions that make
possible formation of new bonds without drastic displacement of atoms.

Considering the process of thinking, a chemist would say that most ideas do not occur to 
a mind simply because the rate of their formation is too low. This statement would extrapolate 
chemical experience that asserts that, while everything is possible, only a few transformations 
are fast enough to be realistically considered. The problem is that transformations go through 
unstable and evasive transition states. The intimate chemical process is predominantly a
redistribution of electron density, which is something chemistry cannot portray by classical
chemical formulas.

The difference between the real dirty chemical world, the mind nested in the real brain of 
flesh and blood, and the mathematical abstractions should never be lost. Nevertheless, chemistry 
conforms to pattern theory pretty well. The main reason for that is that PT is as much a theory of 
regularity as of irregularity. 

The tacitly accepted universal chemical principle (see Part 3, Regularity and probability in 
chemistry) that if we have a certain starting molecular configuration, then anything we can 
imagine can happen to it, presumes that generators, i.e., atoms, are neither created nor destroyed, 
and chemical regularity is preserved. The words “anything we can imagine” used here remind 
again about the more than metaphoric parallel between molecules and thoughts. 


Putting side-by-side real molecules and thoughts may seem a mortal methodological sin. The 
definition of set given by Georg Cantor, “By a set we mean a grouping into one entity of distinct 
objects of our intuition or our thought” (quoted from [10], p.322), gives us a hope of absolution. 


In a real chemical system, the factor of time can be more important than the position of
equilibrium between the configurations involved. In real life, the absolute majority of possible
outcomes are never realized for kinetic reasons: because of the conservation of generators, a few
fastest transformations quickly consume the starting ones and the transformation rate drops. The
slower transformations are, therefore, self-inhibiting. Note that the underlying reason is the
limited resource of atoms. Different transformations compete for a limited resource.
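The competition for a limited resource can be illustrated with two first-order transformations that start from the same configuration A: a fast one, A -> B, and a slow one, A -> C. This is a sketch under stated assumptions: both steps are taken as irreversible, and the rate constants are hypothetical numbers chosen only for illustration. With these assumptions the closed-form solution is standard: A decays with the sum of the two rate constants, and the products share the consumed material in the ratio of the rate constants.

```python
import math


def parallel_products(a0, k_fast, k_slow, t):
    """Concentrations of A, B, C at time t for A -> B (k_fast) and A -> C (k_slow)."""
    k_total = k_fast + k_slow
    a = a0 * math.exp(-k_total * t)   # both channels drain the same pool of A
    consumed = a0 - a
    b = consumed * k_fast / k_total   # share taken by the fast channel
    c = consumed * k_slow / k_total   # share left to the slow channel
    return a, b, c


if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        a, b, c = parallel_products(a0=1.0, k_fast=5.0, k_slow=0.05, t=t)
        print(f"t={t}: [A]={a:.4f} [B]={b:.4f} [C]={c:.4f}")
```

However long one waits, the slow channel never catches up: once the fast channel has consumed A, there is nothing left for the slow one to use.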

Similarly, for a real mind in a real environment, there is hardly ever enough time for 
stochastic relaxation. The need to act interrupts the search for “the truth” or the optimum. The 
competition for time limits the relaxation even in the game of chess, not to mention real life 
because not all alternatives can be optimized and compared. 

While the chemical system is always conservative, regarding the generators, it is not 
clear how this could be realized in the chemistry of thoughts. Intuitively, there could be only one 
generator “red” or “big” in the mind, but of course, scores of them in a text or speech. It is hard
to accept the idea that thoughts exist in duplicates, unless in different minds. It seems to violate
Aristotle’s law of identity. Copies are not identical in some aspect: location, storage, whatever. 
Something must be different, otherwise, two copies are just one. 

Another reasonable question is: if most transformations are not realistic, why do they
happen at all?

The crucial step to understanding why some chemical reactions happen, while most of 
them happen only in the mind, was made in 1889 by Svante Arrhenius (1859-1927), who, by the 
way, made other crucial steps in chemistry (electrolytic dissociation), as well as outside (ideas of 
panspermia and greenhouse effect). 


The aside remarks of the last paragraph may seem a digression if we forget that we compare the 
thoughts and the molecules. 


Svante Arrhenius had a clear idea that even if the transformation is feasible, only a part of
all collisions between molecules result in a change, which, in real life, parallels the fact that only
a part of all acquaintances result in friendship or marriage. He turned his attention to the rate
constant K, which had been a strictly empirical number, and found that

$K = A \exp(-\Delta G^{*} / RT)$,

where K is the rate constant of the reaction, $\Delta G^{*}$ is the energy of activation, A is the fraction of
“productive”, i.e., consummate, collisions, R is the universal gas constant, and T is the absolute temperature.


The universal gas constant R is a form of the Boltzmann constant k adjusted to the molecular mass: R =
N k, where N is the number of particles in a mole of any pure substance (Avogadro’s number).


Remarkably, the energy $G^{*}$ of the transition state is always larger than the energies of the
initial and final states. $\Delta G^{*}$ is the difference between the energy of the transition state and
the energy of the initial (or, for the reverse transformation, final) state.





The form of this equation (same as for equilibrium) suggests that K reflects nothing but
the probability of the initial configuration to reach energy $G^{*}$. It follows from the statistical
mechanics of J. Willard Gibbs (1839-1903), whose name was the source for Gibbs energy G that
plays a crucial role in the real world because it accounts for its order, as well as chaos:

$\Delta G = \Delta E - T \Delta S$,

where S is entropy and E is the chaotic energy of heat (also notated as Q). Here $T \Delta S$ takes into
account the degree of order that subtracts from the chaos. Note the deltas: chemists think only in
differences.


Gibbs energy assumes that the pressure during the change is constant. There are other measures 
of the so-called free energy, i.e., energy capable of producing work. Gibbs energy is so popular 
because most chemists work at atmospheric pressure to avoid explosions, unless they use steel 
vessels. 


Following Arrhenius, only the molecules that reach energy $G^{*}$ (activation energy) are
capable of transforming into products. A more metaphoric interpretation is that there is a barrier
between the initial and final states that must be overcome for the transformation, and
only a small part of molecules with the asymmetrical Maxwell-Boltzmann distribution have
energy sufficient to overcome it.
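The exponential dependence of the rate constant on the barrier height is easy to underestimate. The sketch below evaluates the Arrhenius expression for two hypothetical barriers at room temperature; the prefactor, the barrier values, and the temperature are assumptions chosen for illustration only.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def rate_constant(prefactor, dg_activation, temperature):
    """Arrhenius form K = A * exp(-dG*/RT); dg_activation in J/mol, temperature in K."""
    return prefactor * math.exp(-dg_activation / (R * temperature))


if __name__ == "__main__":
    T = 298.0                                # room temperature (assumed)
    k_high = rate_constant(1.0, 100e3, T)    # hypothetical barrier of 100 kJ/mol
    k_low = rate_constant(1.0, 80e3, T)      # hypothetical barrier of 80 kJ/mol
    print("speed-up from a 20 kJ/mol lower barrier:", k_low / k_high)
```

With these assumed numbers, lowering the barrier by one fifth accelerates the transformation by a factor of a few thousand, which is why only a few transformations out of many conceivable ones are fast enough to be observed.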


Turning to real life, to win somebody’s heart, you have to be not only insistent (E), but also liked 
in return (S), if odds (T) are against you. 


The change of entropy in chemistry comes from the change in degrees of freedom in 3D 
space during the transformation. For example, the closure of a large ring from a linear chain is 
accompanied by a significant loss of entropy, the longer the chain the larger, because the ends 
have to find each other in ever larger space. As the chemists say, “it is difficult.” To continue the
molecule-thought parallel, creativity means connecting distant ideas, because what is close on
hand is obvious.
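The entropy loss on ring closure can be made tangible with a toy model. The sketch below is an illustration under strong assumptions, not a chemical calculation: the chain is modeled as an ideal random walk on a cubic lattice (links may cross each other, no excluded volume), and "closure" means that the last link returns exactly to the starting point. The probability of closure is computed exactly by propagating the distribution of the chain end, and its negative logarithm serves as the entropy cost in units of the Boltzmann constant.

```python
import math
from collections import defaultdict

# The six unit steps of a cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def closure_probability(n_links):
    """Probability that an ideal n-link lattice chain ends where it started."""
    dist = {(0, 0, 0): 1.0}
    for _ in range(n_links):
        nxt = defaultdict(float)
        for (x, y, z), p in dist.items():
            for dx, dy, dz in STEPS:
                nxt[(x + dx, y + dy, z + dz)] += p / 6.0
        dist = nxt
    return dist.get((0, 0, 0), 0.0)


def closure_entropy_cost(n_links):
    """-ln(p): entropy given up when the ends meet (n_links must be even)."""
    return -math.log(closure_probability(n_links))


if __name__ == "__main__":
    for n in (2, 4, 8, 12):
        print(n, closure_probability(n), closure_entropy_cost(n))
```

In this toy model the cost grows steadily with the chain length, in line with the statement that the longer the chain, the larger the loss of entropy.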

The entire picture of chemical equilibrium, therefore, looks as shown in Figure 7.1.

Between the stable initial state A and the stable final state B (the initial and final states
can be reversed) lies an unstable/improbable/irregular transition state $AB^{*}$. It is the
equilibrium between the stable states and the unstable transition state that determines the rate of
transformation.

The position of equilibrium between stable states does not depend on the energy of the 
unstable state because its concentration is negligible. 

It is presumed that the equilibrium between $AB^{*}$ and both A and B establishes much
faster than the equilibrium between A and B, and only because the probability/concentration of
$AB^{*}$ is low, it becomes the true bottleneck of the transformation.

Rigorous logic may find obvious gaps in this reasoning. We still remain within the 
equilibrium paradigm and do not introduce any new and radical ideas about the kinetics itself. It 
is not clear how time can enter the picture. 



Figure 7.1 Transformation of A into B





The study of absolute reaction rate and all details of the transition process is a separate
area of theoretical chemistry with a strong quantum-mechanical flavor. It is beyond the scope of
this paper. It must be noted, however, that the transition state is not a stable substance that could
be kept in a jar overnight, but the entire process of transformation from A to B. A possible basis
for our kinetic assumptions is that what happens in the transition state usually does not involve
any radical displacement of the atomic nuclei and is limited to the redistribution of electron
density, which in fact happens, by the human time standards of day and night, practically
instantaneously. Anyway, in spite of their logical incompleteness, the concepts of transition state
and activation energy serve chemistry very well because chemists, as intuitive Bayesians, operate
with differences and not the absolute values. They are not interested in the absolute probability
of a structure, which not only can be attained only in an infinite time but also depends on what other
structures are involved. It will suffice that the structure is stable.

Chemists are trying to figure out the most probable configuration on the condition that
the known (sometimes, partially) initial configuration takes place at time $t_0$, and the next
measurement is taken at time $t_n$, i.e., for A -> B, they have to calculate $P(B_{t=n} \mid A_{t=0})$. As a
reminder, the probabilities are expressed as concentrations. A chemist, however, practically never
has a complete set of alternatives, which is exactly why chemical experience might present
interest for pattern chemistry of the mind. Moreover, the chemist is usually interested in
achieving a certain stable state with maximal probability, which is what an average mind carrier
pursues, too.
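For the idealized case where the set of alternatives is complete and consists of only two states, the conditional probability of finding B at a later time, given A at the start, has a well-known closed form. The sketch below assumes a reversible first-order transformation between A and B with hypothetical forward and reverse rate constants; a real chemist, as noted above, rarely has such a closed system.

```python
import math


def prob_b_given_a(t, k_forward, k_reverse):
    """P(B at time t | A at time 0) for reversible first-order A <-> B."""
    k_sum = k_forward + k_reverse
    equilibrium_share = k_forward / k_sum   # long-time limit of P(B)
    return equilibrium_share * (1.0 - math.exp(-k_sum * t))


if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0, 20.0):
        print(t, prob_b_given_a(t, k_forward=1.0, k_reverse=0.25))
```

The probability starts at zero, rises at a pace set by the sum of the rate constants, and levels off at the equilibrium share, which depends only on their ratio.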

Whether animals think or not is open to questioning, but they certainly pursue goals. 
What we call goal is the imaginary final state separated from the real initial state by a transition 
state, which is not quite clear in all detail. While pursuing goals, humans and animals alike, or, 
rather, their minds, work as typical chemists, optimizing the sequence of mental configurations 
ending with the goal. The usual problem is that another overlooked final state is quite possible. 
As the story of king Croesus and the oracle of Delphi tells us, final states could be well beyond 
our imagination even if we have reliable cues. 

According to Herodotus, the oracle predicted that Croesus would destroy a great empire. The destroyed
empire turned out to be his own.






Figure 7.2 An incomplete transformation space (columns: initial, transition, and final states).
Broken lines symbolize other alternatives, not even imaginable.


The typically chemical situation looks as shown in Figure 7.2. There are a few possible
transition states, each leading to one or a few final states, sometimes overlapping. The chemical
tree of choices is usually incomplete and most of the alternatives are discarded due to prior
knowledge (the chemical “system of beliefs”) as well as because of clear-cut principles. It is
natural to assume that a similar tree of choices, in the form of regular and irregular
configurations, should be optimized equally by a wolf or a businessman planning to make a kill,
while the deer and the competitor have their own trees.

Bayesian inference has been a subject of numerous arguments and the opinion of a
chemist weighs little, but a chemist would suspect that few arguments could arise over a
well-defined system. Neither in chemistry nor in intelligence, however, nor, for that matter, anywhere
in the world, except in the speeches of political leaders, do we have well-defined systems, a
suspicion first expressed by Heraclitus. Thus, an escape from a prison or a catastrophe on the
September 11 scale is possible only because neither the prison security nor national security is
a well-defined system and some transition barriers happen to be invitingly low when the
temperature is high enough.





The idea of transition state comes from the mere observation that nothing happens at once 
and there must be a reason why events happen at all. If not for the barrier of the transition states, 
all the molecules on earth would react and fall into equilibrium and all “life’s persistent 
questions” could be solved in an instant. In fact, nothing is instantaneous. 



Figure 7.3 Metaphor of bonding

Figure 7.3 metaphorically illustrates the concept of transition state by the process of
adding a key to a key ring. In the transition state, the ring is deformed and brought into an
unstable shape, which facilitates the attachment of the second key.

Transition state as process is very well known in technology and meteorology. Everyday
human activity also consists of stable periods punctuated by short-lived transition states, among
them, transitions between sleep and alertness. In Ashby’s homeostat, all steps of the transition
state are observable and recordable.

The following is an illustration of how the same chemical transformation can go through
different transition states.

There are two possible mechanisms of recombination A—B + C—D -> A—C + B—D,
which the chemists call substitution.





Generator space: A, B, C, D.

Mechanism 1.

Stage 1: Dissociation of a bond

C—D -> C* + D*

Stage 2: Formation of a triple transition complex and the new bond

A—B + C* -> C...A...B

Stage 3: Formation of a new bond

C...A...B -> C—A + B*

Stage 4: Recombination of fragments

B* + D* -> B—D

Mechanism 2.

Stage 1: Dissociation of bonds

C—D -> C* + D*; A—B -> A* + B*

Stage 2: Recombination of fragments

A* + C* -> A—C; B* + D* -> B—D







The arrow everywhere means a reversible transformation.

In Ulf Grenander’s chemistry of thoughts, a configuration with incomplete bond couples 
is considered irregular, [1], (2.4), which is consistent with how the chemists see it. Irregular 
extra bonds, exceeding the regular arity, could also be added to that. 

The transition state is always an irregular configuration, usually with the rule of octet
violated. Since transition states cannot be portrayed by common chemical formulas, new
symbols were invented for them after their nature had become clear, and of course, by definition,
they cannot be stored in a jar. It does not mean that they cannot be studied. Chemistry itself is in
a perpetual transition state, few things are static, and we can expect in the future much more
detailed knowledge of the intimate mechanisms of chemical transformations. The same can be
stated about the science of the mind.




8. Catalysis 


The question arises: how could transformations be made simple, selective, and fast if typical 
reactions in organic chemistry are notoriously slow and tend to run in many directions 
simultaneously, generating mixtures of products? 

The most powerful tool in chemical time management is catalysis. Catalysis is a cardinal
chemical concept that opens a passage from the classical equilibrium chemistry to the
non-equilibrium phenomena of LMS. It prevents chemical chaos inside the living cell. It is easily
rationalized in pattern terms. Naturally, some subtleties, meaningful for a chemist but
nonessential for the pattern picture, will be sacrificed.

The closest pattern relative of catalysis is signal or message in information theory. While 
information changes the probability distribution of outcomes toward lower entropy, the catalyst 
does the same with transition states. 

Catalysis has no influence on the position of equilibrium, but, like an earthquake,
dramatically warps the kinetic landscape. Similarly to signal and earthquake, it makes a notch on
the time axis. It starts at the moment of introducing the catalyst into the system, but in due time,
as any information, becomes old hat. Let us note this property of novelty, typical for LMS,
because we shall return to it later. The catalytic effect dissipates with time in the same way all
news becomes old hat the next day. In the long run, the effect of the disturbance created by the
catalyst is erased by equilibrium. The catalyst itself returns chemically unchanged, like an arrest
warrant after it has been read and produced the desired dramatic effect in a detective story.

In pattern terms, catalysis is based on the interplay between regular (solid lines) and 
irregular (broken lines) bonds, Figure 8.1. The irregular bonds in question are perfectly regular 
from the point of view of physics, but not from the pattern-chemical one. They are weak, labile, 
and often multiple. 


Figure 8.1 Pattern catalysis (panels: initial state with A, B, and Catalyst; transition state;
final state with A-B and Catalyst). Note that A is in a wrong initial position for binding.

The transformation involves three initially disjoint configurations: A, B, and Catalyst. 
Configurations A and B could be subconfigurations of the same configuration, as, for example, 
the ends of a linear chain. 

The entire initial configuration can have high entropy in the Euclidean space where the
particles form bonds only in particular spatial orientations and at a close distance. To bring them
into the bonding position, with much lower entropy, the energy is borrowed from “irregular”
weak chemical bonds shown by broken lines. The transition state undergoes a transformation,
coupling the provisional and irregular bond A—B, which becomes regular after the catalyst splits
off the transformed substrate. This happens sooner or later because of the reversibility of all
stages of the process.

The temporary bonds are “irregular” only relative to a certain definition or standard of
regularity. As we saw, the intuitive and fuzzy understanding of chemical
regularity amounts to the question of stability of a chemical substance. In fact, it is not the walls
of the glass jar on the lab shelf that prevent the transformation but the invisible walls of kinetic
barriers. It is appropriate here to compare the kinetic barriers to the barriers of understanding
erected by ignorance, prejudice, bias, and habit.


There are different types of chemical catalysis. They all work in three basic steps. The
catalyst

(1) binds to a specific substrate (Gibbs energy decreases),

(2) forms a new transition state (Gibbs energy increases but stays lower than for the
non-catalytic transition state), thereby strongly increasing the speed of the transformation, and

(3) separates from the changed substrate after the transformation is complete (Gibbs
energy decreases and can be either higher or lower than in the initial state).


By catalysis we mean here only the so-called heterogeneous and template catalysis, omitting the
homogeneous type, which is somewhat subtler.


The distinction between the strong and weak bonds is crucial for catalysis, as well as for 
all living systems. The weak bonds are very labile, which means that they close and break up in 
the temperature range where the strong bonds cannot dissociate into highly irregular loose atoms. 
The weak bonds could be compared with joined magnets or pieces of modeling clay, while the 
strong bonds are, so to say, locked by bolts and nuts and can be taken apart only with a tool or an 
explosive. At a very high temperature, of course, all bonds are weak and chemical complexity 
disintegrates. Properties of various bonds are illustrated in Figure 8.2.

Catalysis simply restructures the energy balance sheet of the system, without changing
the bottom line. This is why it has no effect on the position of equilibrium and equally enhances
the direct and the reverse transformations. The systems like life, mind, and society exist because
the initial intake of free energy and its dissipation into heat and simple metabolites prevent the
equilibrium.
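That a catalyst speeds up both directions equally, leaving the bottom line untouched, follows directly from the Arrhenius form used in the previous part. The sketch below is an illustration with hypothetical energy levels (in J/mol) for the initial state, the final state, and the transition state with and without catalyst; the function name and all numbers are assumptions.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def rate_pair(g_initial, g_final, g_transition, temperature, prefactor=1.0):
    """Forward and reverse Arrhenius rate constants over a shared transition state."""
    k_forward = prefactor * math.exp(-(g_transition - g_initial) / (R * temperature))
    k_reverse = prefactor * math.exp(-(g_transition - g_final) / (R * temperature))
    return k_forward, k_reverse


if __name__ == "__main__":
    T = 298.0
    kf, kr = rate_pair(0.0, -10e3, 100e3, T)         # hypothetical uncatalyzed barrier
    kf_cat, kr_cat = rate_pair(0.0, -10e3, 70e3, T)  # hypothetical catalyzed barrier
    print("speed-up, forward:", kf_cat / kf, "reverse:", kr_cat / kr)
    print("equilibrium ratio without / with catalyst:", kf / kr, kf_cat / kr_cat)
```

The energy of the transition state cancels out of the ratio of the forward and reverse rate constants, so the position of equilibrium depends only on the difference between the initial and final states.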

If the three initially disjoint configurations (two components of the substrate and the
catalyst) lose their independence, it means a significant drop of entropy, which requires energy
to compensate for. If the $\Delta S$ contribution to $\Delta G = \Delta Q - T \Delta S$ decreases, $\Delta G$ increases, and the
reaction slows down. The source of this energy is the formation of weak and labile bonds
(hydrogen bonds, Van der Waals bonds, electrostatic interaction, interaction with the solvent, etc.)
that are outside the area of "regular" strong chemical bonds, which are difficult to break. The disconnected
irregular fragments of regular bonds have high energy and low probability. The fragments of
weak irregular bonds, for example, the bonds between molecules of water, are individual and
stable regular molecules.


Figure 8.2 Properties of regular and irregular bonds (a stable strong regular bond gives
unstable fragments; an unstable weak irregular bond gives stable fragments)


The most interesting case is when the substrate and the catalyst conform to each other
like lock and key. To continue the analogy, some important catalysts work rather like the safe
deposit box and two keys to it. Figure 8.3 illustrates this case by the joining of the two keys with
a single ring, this time aided by fixing the positions of the keys, which simplifies the operation
and eliminates unnecessary fumbling with the objects in the transition state.

It is important to emphasize that the catalyst works in both directions and does not 
change the position of the equilibrium, which does not depend on the transition state. The 
remarkable effect of catalysis can be seen only at non-equilibrium conditions, which is exactly 
what life, mind, technology, and society require for functioning. In the long run,
thermodynamics overrides kinetics, and the catalytic effect is completely erased. In a
non-equilibrium open system, however, nothing is “long-run”.

Transformations of relatively small compounds (small in comparison with polymers)
may go in just a few directions because of large differences in $\Delta G^{*}$ between different transition
states. If we deal with gigantic polymers, like proteins and nucleic acids, where the configuration
space is produced by combinatorial explosion, all the polymers have close energies. In such a
degenerate system, an ultimate chemical mess would be expected if not for the catalysis. The
catalyst brings order into a highly chaotic system and, therefore, works no differently from any
package of information.



Figure 8.3 A metaphor of catalytic transformation: the joining of two keys by a 
ring is made easier by the fixed positions of the keys, of course, not by two locks. 


A degeneration of configuration space parallels the situation in a computer where any 
sequence of symbols of the same length requires the same time—and energy—to transfer and 
process. Classical thermodynamics in both systems does not depend on the content of a 
configuration, which is always a linear sequence of a limited number of symbols. Unlike the 
chemical system, the computer is, ideally, completely frozen and nothing happens in it unless on 
command, while life and society have a significant degree of spontaneity and can extract 
information from the world.

Finally, we shall formulate what catalysis means in pattern terms, without any recourse
to chemistry:









Two generators in the neighborhood of the third are in the neighborhoods of each 
other. 

Or: 

Coupling of two generators with the third strongly increases the probability of 
their coupling. 

Or: 

The probability/affinity of a bond between two topologically closest generators 
sharply increases if they are bonded to the same third generator. 


The concept of catalysis presents a curious problem related to the lack of the concept of 
history in physics, where anything that happens for the first time has to be pre-existing. In 
information theory, the events display between the input and the output and they are of repetitive 
and reproducible nature. If a pinch of the catalyst is dropped into a flask, however, this may 
never happen again in the same system with the same result. 


W. Ross Ashby noted that circumstance while observing his homeostat. He suggested (in an
exaggerated interpretation) that the unique event would simply mean that, for example, the input 
voltage has been zero for half eternity and then changed to 1 and, maybe, back to zero, where it 
could remain for the second half of eternity. 


An alternative point of view is the concept of evolution, fundamental for LMS, consisting
of a chain of unique events, with the current result explainable only in terms of historical record.
The concept of pattern evolution means that the generator space and the configuration space, as
well as regularity, cannot be formulated axiomatically once and forever. The historical (or
Heraclitean) systems are characterized by a complete absence of ergodicity: the representative
point never passes the same cell twice. Paraphrasing Heraclitus, one cannot step into the same
phase space twice. It expands, contracts, and warps as in the most audacious sci-fi movies. What
is remarkable, however, is that the same patterns can be traversed repeatedly. He-systems can be
ergodic at the pattern level. There are no principal obstacles for accommodating this view by PT,
which could open a case on patterns of history, where transition state would take the central
place and patterns can be revisited in the pattern space. Nevertheless, the parameter of novelty
precludes the ergodicity of classical statistical ensembles.


In the 1950’s and 60’s, when physicists turned their attention to life, an apparent impossibility to 
assemble a large molecule from atoms without pre-existing information had been the greatest 
mystery. From the point of view of physics, origin of life was impossible, although life itself was 
thermodynamically understandable as a non-equilibrium open system. From the point of view of 
chemistry, there was no problem at all: in a very large He-complex configuration space, 
complexity can develop from a very simple combinatorial base set in a long sequence of simple 
steps. 

Looking back, the conundrum was a natural consequence of the inherent reluctance of the 
pre-computer physics to deal with irreducible complexity, i.e. objects represented by lists that 
cannot be compressed into formulas. Chemistry, on the contrary, developed as an art of 
navigating through the irreducible complexity of millions of individual compounds and their 
transformations. Although chemistry has arrived at a set of general concepts, similar to a
well-balanced literary analysis of Shakespeare, the true joy of chemistry can be found in the lab, which
is a perennial Wild West.

The basics of the chemical art of navigating in large configuration space include the
principle that the change in a large system is sequential and local. Similarly, marine
navigation is based on the assumption that the ship does not hop all over the world ocean, from one
hemisphere to the other, but moves through a sequence of close positions on a 2D surface.

On the contrary, an abstract statistical ensemble of independent particles has no such
limitation: under the principle of ergodicity, a sequence of states may be arbitrary, as for the
gas-like ensemble of colliding particles. The main property of the virtual world is, probably, that any
state of the computer monitor can be followed by any other, even though the sequence of
intermediate stages can greatly vary. In the real chemical world, the jumps of the phase trajectory
are rare and short. At the same time, jumps in the mind are quite natural.




9. Competition and Selection of Configurations 


If cybernetics was the alchemy of the mid-20th century, then Walter, 
Ashby, Beer and Pask were the Magi. (Andy Pickering, [15]). 

Artificial Life (often abbreviated as Alife or A-life) is a small universe existing parallel to the 
much larger Artificial Intelligence. The origins of both areas were different. Alife arose as an 
abstract mathematical study of generalized life after the molecular biology had taken its modern 
shape and the molecular mechanisms of life had become transparent. On the contrary, AI, as if 
anticipating a slow progress of the study of the intimate mechanisms of Natural intelligence (NI), 
took up the mind as a black box and focused on imitating and amplifying its functions. 

While AI is a huge area of research involving many people who pursue practical goals, 
Alife, though ambitious, remains a kind of an intellectual game. Having come from chemistry, it 
presents interest as generalization of non-equilibrium and non-linear chemical experience. 


The full story of Alife, with purely chemical roots going back to Svante Arrhenius, is beyond the
scope of this paper. Its modern chapter starts with chemist Manfred Eigen (Nobel Laureate in
chemistry, 1967). People in AI are aware of his works, mostly from the point of view of game
theory, while Alife is well aware of the Ising model. Ilya Prigogine and Manfred Eigen inspired
the modern science of complexity, which today is best of all represented by the Santa Fe Institute
(Stuart Kauffman, John Holland, Christopher Langton, Peter Schuster, and many others). It
studies the theoretical aspects of LMS, excluding the US Tax Code, the most complex creation of
all.

Ross Ashby captured the state of evolutionary divergence of the emergent AI into
computer and life sciences in his An Introduction to Cybernetics [13]. In spite of Alan Turing’s
own biological interests, the Turing Machine gave a great impetus to AI toward computer science
rather than life science. Nevertheless, the anthropomorphic Turing test was based on live human
intelligence as the reference point and was, in essence, a test for adaptability of AI in the
environment of NI. It seems that the advent of computers greatly favored the formal AI in
competition with adaptive AI. Today, however, when the exchange of ideas is as common as the
exchange of flu viruses, AI, Alife, and science of complexity look like a single continent in the
shape of the Americas with Alife as the Panama Isthmus between the science of complexity and
AI.


What is abstract life without trees, elephants, and humans who try to destroy and save 
them all and each other? Alife tries to answer the question “what is life” by modeling the major 
properties of life at the basic molecular level, but, in the pattern spirit, not being overly obsessed 
with chemical formulas. Alife accepts the Darwinian paradigm that life is competition of species, 
such as trees, elephants, and humans, for limited resources in an evolving system. The term 
evolving system means not only that the system is thermodynamically open and far from 
equilibrium, but, more important for us, that it is a system of chemical type, i.e., evolving in a complex 
and rich configuration space. It is a system of competing configurations and patterns that 
reproduce themselves. 

It is a matter of convention whether generators and configurations exist in multiple copies 
in both structural chemistry and PT. In Alife, as in the material world, generators exist in copies 
and configurations can multiply. In AI, as in mathematics, two identical ideas are just one idea. 


The multiple generators of whatever kind do not contradict the general framework of PT if we assume 
similarity transformation COPY. 





The phenomenon of reproduction has been long known in chemistry as autocatalysis 
without any biological connotations: the catalyst catalyses its own formation. Figure 9.1 shows 
how it works in an abstract form. 

Configuration D (disjoined, as is common in chemistry) consists of generators A and B, 
each existing in four copies. Two generators are coupled. The multiplicity of free generators 
simulates the availability of building blocks in the environment. 


[Figure: configuration D (free generators A A A and B B B plus one coupled pair A—B) and the 
templated configurations K and L] 

Figure 9.1 Template catalysis in replication 

P(A- -A) > P(A- -B) leads to K; P(A- -A) < P(A- -B) leads to L 

Suppose generators can form regular and irregular bonds. Consider first the case in which 
identical generators form a stronger (more probable) irregular bond, shown by broken lines, than 
different generators: 

P(A- -A) > P(A- -B) or G(A- -A) < G(A- -B) 

Then A—B, as in Figure 9.1, or A—A and B—B otherwise, will catalyze their own 
replication. One dimer works as a template for another. 

In the opposite case, when the greatest affinity is between different generators, the 
replication will be complementary, as in natural life. Note that errors are inherent in this 
kind of probabilistic reproduction; their frequency depends on the difference of energies (or the 
ratio of probabilities) of the two irregular bonds. 
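
The templated replication described above can be illustrated with a short Python sketch. It is 
only an illustration under stated assumptions, not the author's model: the function name 
copy_template and the parameter p_match (the probability that a template position attracts the 
generator favored by the stronger irregular bond) are hypothetical. With complementary=False 
the favored partner is the identical generator (case K); with complementary=True it is the 
other generator (case L). 

```python
import random

def copy_template(template, p_match, complementary=False, rng=random):
    """Build one copy of `template` (a string over 'A' and 'B'), position by position.

    p_match is the assumed probability that the favored partner attaches
    at a given position; 1 - p_match is the per-position error rate.
    """
    other = {"A": "B", "B": "A"}
    copy = []
    for g in template:
        favored = other[g] if complementary else g
        # With probability p_match the favored generator attaches; otherwise the wrong one.
        copy.append(favored if rng.random() < p_match else other[favored])
    return "".join(copy)

if __name__ == "__main__":
    rng = random.Random(0)
    print(copy_template("AB", 1.0))                      # error-free identical copy
    print(copy_template("AB", 1.0, complementary=True))  # error-free complementary copy
    noisy = [copy_template("AB", 0.9, rng=rng) for _ in range(1000)]
    print(sum(c == "AB" for c in noisy) / 1000)          # fraction of exact copies
```

The error rate per position is the single knob here; in the text it would follow from the 
difference of energies of the two irregular bonds. 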

The question arises, if free uncoupled generators are irregular, how can they be loose and 
lonely, as in Figure 9.1? They cannot. In biochemical systems, the strong bonding between 
generators is, at mild temperature, practically always not a coupling but a recombination, with 
the participation of water and its fragments. It is called condensation; the reverse transformation 
is hydrolysis. 


Condensation ⇒  A—H + B—OH  ⇌  A—B + H—O—H  ⇐ Hydrolysis 


For biopolymers to be formed, a supply of Gibbs energy is required because the 
equilibrium involving water is shifted toward hydrolysis. This energy comes from ATP, the 
universal energy carrier of life. The biochemical details are not essential for us, however. What is 
essential is that biopolymers can form only in a system with consumption of Gibbs energy and its 
dissipation into heat. 


We do not talk about “supply” of thermal energy when considering a system at a certain temperature. We 
need, however, a supply of Gibbs energy because it is a perishable commodity, turning into heat if not used 
for work. 


As soon as we postulate replication in a chemical system, it becomes a model, although 
very much incomplete, of life. The pioneering work of Manfred Eigen [16, 17], who investigated 
the kinetics of this type of systems, gave an impetus to the whole new area. 

Manfred Eigen was concerned mainly with linear sequences for two reasons: the linear 
model is the simplest for simulation and it is the closest to natural linear chains such as proteins, 
DNA, and texts. 


Manfred Eigen’s main question was: what can happen in this kind of a system from the 
point of view of chemical kinetics, provided the configurations act as self-replicators, i.e., each 
polymer enhances the formation of its copy from monomers? To ensure replication, a supply of 
generators (monomers) and Gibbs energy to link them were postulated. The linear polymers 
multiplied like the hare and lynx in Lotka-Volterra’s famous model. The total number and length 
of generators existing in multiple copies in the basic model were kept constant, although that was 
not a necessary condition. 

The replication is prone to errors and the most probable one is a mutation in a single 
generator. Hamming distance defines the metric in the sequence space (Figure 9.2). 
Sequences longer than trimers occupy the vertices of hypercubes, with the most probable 
mutations along the edges. 


[Figure: the trimers of A and B (AAA, AAB, ... BBA, BBB) at the vertices of a cube] 

Figure 9.2 The sequence space for trimers 
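
The metric of the sequence space can be made concrete with a small Python sketch. The 
function names hamming and one_mutant_neighbors are hypothetical and chosen only for this 
illustration; the two-letter alphabet "AB" follows the figure. 

```python
from itertools import product

def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s, t))

def one_mutant_neighbors(seq, alphabet="AB"):
    """All sequences at Hamming distance 1 from seq (the adjacent vertices)."""
    return [seq[:i] + g + seq[i + 1:]
            for i in range(len(seq))
            for g in alphabet if g != seq[i]]

# The eight trimers are the vertices of a cube; edges join one-mutant neighbors.
trimers = ["".join(p) for p in product("AB", repeat=3)]
edges = {frozenset((s, t)) for s in trimers for t in one_mutant_neighbors(s)}

if __name__ == "__main__":
    print(hamming("AAB", "BBB"))
    print(one_mutant_neighbors("AAA"))
    print(len(trimers), len(edges))
```

For sequences of length n over two generators the same construction gives the n-dimensional 
hypercube mentioned in the text. 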


The realism of Eigen’s model was not in detail but in the type of kinetics that it 
employed. It was a starting theme for a multitude of subsequent variations in Alife. 

In the simplest case, the imaginary system under consideration comprises n species (linear 
configurations) with $x_1, x_2, \ldots, x_n$ individuals in each. 

In Eigen's model, the simplest system is described by the following kinetic equation: 

$$\frac{dx_i}{dt} = A_i Q_i x_i - D_i x_i + \sum_{j \neq i} w_{ij} x_j \qquad (9.1)$$

$$\sum_i \frac{dx_i}{dt} = 0$$

where $A_i x_i > 0$ is the component of self-reproduction (birth), diminished by the quality 
parameter $Q_i < 1$, i.e., the measure of errors during reproduction, 

$D_i x_i$ stands for spontaneous decomposition (death), and 

$\sum_{j \neq i} w_{ij} x_j$ is the production of species $x_i$ through mutations from all the other species. 

It is the canonical form of kinetic equation for a system with gain, loss, and interaction. The 
first term is growth, not necessarily through replication, and the second term is decay. The third 
term represents influence of topological neighbors. Replication is the only way a population can 
grow, but the equation is not limited to discrete metrics: $x_i$ can be, for example, probability. The 
presence of neighbors assumes a topology. 
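
A minimal numerical sketch of equation (9.1) in Python may help to see how the three terms 
work together. It is not Eigen's procedure: the explicit Euler step, the rescaling that keeps 
the total population constant, and all numerical values in the demonstration are assumptions 
made for this illustration, as are the names eigen_step and simulate. 

```python
def eigen_step(x, A, Q, D, W, dt):
    """One explicit Euler step of equation (9.1).

    x, A, Q, D are lists of length n; W[i][j] is the mutation rate w_ij
    from species j to species i (the diagonal is ignored).
    The rescaling at the end is an assumption of this sketch: it stands in
    for the constraint that the total population stays constant.
    """
    n = len(x)
    total = sum(x)
    new = []
    for i in range(n):
        mutation_in = sum(W[i][j] * x[j] for j in range(n) if j != i)
        dx = A[i] * Q[i] * x[i] - D[i] * x[i] + mutation_in
        new.append(max(x[i] + dt * dx, 0.0))
    scale = total / sum(new)
    return [v * scale for v in new]

def simulate(x, A, Q, D, W, dt=0.01, steps=2000):
    """Repeat eigen_step and return the final population vector."""
    for _ in range(steps):
        x = eigen_step(x, A, Q, D, W, dt)
    return x

if __name__ == "__main__":
    A = [2.0, 1.0, 1.0]          # species 0 replicates fastest (assumed values)
    Q = [0.9, 0.9, 0.9]
    D = [0.1, 0.1, 0.1]
    W = [[0.0, 0.05, 0.05], [0.05, 0.0, 0.05], [0.05, 0.05, 0.0]]
    print(simulate([1 / 3, 1 / 3, 1 / 3], A, Q, D, W))
```

With these assumed coefficients the fastest replicator comes to dominate, but the mutation 
term keeps the other two species present, so the outcome is a distribution rather than a 
single winner. 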

Eigen’s simple system possesses Ar-complexity: all generators are listed in advance. The 
configurations are combinations of symbols of a certain length. Neither the symbols nor the length 
enter the equations. As Eigen noted, for combinatorial reasons, the number of all possible sequences 
can easily exceed the number of available monomers. The term evolution can be applied here in its 
narrow physical sense, as the movement of the representative point over the phase space of a 
however large but constant volume. 

The third term of the equation is similar to the term of the exchange with the neighbors in 
the Ising model. The connector, however, is not a 2D lattice but the sequence space. 

The mathematical picture of competition, based on replication with errors, is very general. It 
seems that various biological and physical models belong to the same general class of processes. 
The class is described by the differential equations that in the most general case contain the 
following terms: 


1. Birth (or remembering, multiplication, growth, formation, self-perpetuation, 
success) 

2. Death (or forgetting, extinction, decline, dissipation, decomposition, failure) 

3. Interaction with topological neighbors in terms of probability of birth and death 
(in a relaxed version, success and failure). 


The phenomena of birth and death demarcate the border between “plain chemistry” and 
biochemistry, or, to put it differently, between meta-chemistry and LMS. 

The possibility of birth and death in an open chemical system needs explanation. It is 
appropriate to distinguish between two types of bonds, positive and negative, see Figure 9.3. 



Figure 9.3 Positive and negative bonds 

The positive chemical bonds require energy to break them and the negative bonds require 
energy to form and maintain them. All chemical bonds, whether strong or weak, are positive: the 
stable bond forms with the release of energy. This is why catalysis is possible: the initial bonding 
with the substrate is spontaneous. 


The quantum-mechanical picture of chemical bond includes the negative and neutral versions as 
anti-bonding and non-bonding molecular orbitals, along with the regular bonding orbital. They 
are solutions of the Schrodinger equation. 


In biochemistry, where the bonds are formed by condensation, with the evolution of 
water, the equilibrium is thermodynamically shifted toward hydrolysis (breakup) of the 
sequence. Therefore, a bond couple between monomeric generators can form only on the 
condition that energy is delivered selectively to the site of bonding, which is done through ATP 
(adenosine triphosphate) in the presence of an enzyme. 


This is not possible in spin models, where the energy may come only from a field. Thus, 
the Ising model has the Hamiltonian 

$$H = -J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - h \sum_i \sigma_i$$

where h is a uniform applied field. This illustrates the dramatic difference between physical 
and LMS models. 


Thermodynamically, life is a dissipative structure, like the eddies in a turbulent flow, 
Benard structures in water in a container on a hot plate, tornadoes, and other phenomena existing 
only due to the supply of energy. The physics of dissipative processes in connection with life was 
explored in depth by Ilya Prigogine, the founder of non-equilibrium thermodynamics, and his 
school [Prigogine is author of numerous books on the subject]. As Eigen formulated, “Systems 
of matter, in order to be eligible for selective self-organization, have to inherit physical properties 
which allow for metabolism, i.e., the turnover of energy-rich reactants to energy-deficient 
products, and for (‘noisy’) self-reproduction. These prerequisites are indispensable.” [16]. 


The mind, located in the brain, 2% of the body by weight, consumes 20% of energy. We 
do not know how this energy is used, but it is obvious that only a small part of all daily content 
of the brain leaves any trace in it, while a part of the earlier stored material disappears. Two 
hypotheses on thought dynamics can be inferred from this fact. 

The first hypothesis of cardinal importance is that the thoughts, or whatever 
configurations in the brain are behind them, must compete for the energy supply in order to 
come to existence, albeit for a short time. Like the polymers in biochemistry, they need Gibbs 
energy, and, probably, they are polymers of a kind, in the pattern sense. 

The second hypothesis is that the conscious thoughts are just the tip of the iceberg, like 
the most active molecules in the Maxwell-Boltzmann distribution. They reach the high energy level 
of consciousness. Below them are scores of other (subconscious) configurations that fall into 
the medium part of the distribution. 

The remarkable consequence of competition for a limited resource is that it unfolds in 
time because of the form of the canonical equations. The goal of the “survival” is to “multiply 
existence” toward the next moment. This is how time enters the picture. 


The complexity of life appears to be a consequence of an extremely large size of biosphere, while 
the complexities of modern society come from the limited size of the globe. As far as mind is 
concerned, the interesting question is the relation between the size of the real world and the upper 
and lower limits of the size of the mind that can adapt to it. Obviously, the size of the world of a 
squirrel is not only small but also fluctuating within a limited range. The size of the human world, 
however, is that of an expanding He-system. Probably, it was the use of tools and language that 
launched the expansion. It can be classified, in Eigen’s terms, as a hypercycle: each segment of 
the closed cycle catalyzes the next. Without the scale-of-sets mechanism, however, it is difficult 
to understand why life has been growing more complex throughout history. The population 
dynamics alone does not explain it. 


The mathematical analysis of Eigen’s system showed a complex behavior depending on the 
ratios of the coefficients in the equations. The selected configuration is a population, not a single 
species, with a probability distribution of configurations around a certain average. The population 
may either drift through a sequence space or come to a dominating population, or experience a 
collapse of information content. Populations are characterized by their fitness, which is a value 
similar to energy (but is maximized instead of minimized in spontaneous processes). The 
populations have certain positions on a fitness landscape, where those with high fitness (high 
stability) reside on peaks, while the unstable ones take up valleys ([5], p. 199). 

It is the concept of a landscape that unites all LMS systems because it introduces a universal 
function similar to Hamiltonian in an abstract metric, but generally non-Euclidean, space. Within 
the framework of Alife, it is the sequence space. Ulf Grenander calls its intellectual version 
mindscape ([1], 2.9). As can be seen from Figure 7.2, a chemical system of sufficient 
complexity is also a peculiar collection of points on an energy landscape (chemscape) consisting of 
peaks of the transition states and valleys of stable states. The positions of the points are neither 
independent nor static. 


The concept of fitness landscape is appealing as a metaphor, but highly controversial. Thus, 
according to the concept, on a smooth landscape, two close sequences may have close fitness, while 
on a rugged landscape, a step aside may have dramatic consequences. But why? The concept does 
not provide any causality. Moreover, it does not take into account the kinetics that may drastically 
disrupt the bliss of classical statistical mechanics. Besides, as Alife realizes, the landscape changes 
right under the feet of populations. The LMS science of the future can draw realism from chemistry 
in its generalized form of PT, however complex it may seem. 


Next, some general principles of evolving He-systems will be illustrated by computer models. 




10. The Competitive Mind 


Birds do it, bees do it 
Even educated fleas do it 
Cole Porter [18] 


If molecules, species, things, and humans compete, then thoughts—their shadows in the mind— 
must do it, too. 

Henri Poincare (1854-1912) seems to be the first to express, around 1901, the concept of 
thoughts competing in the mind for a limited place in consciousness [19]. His reasoning had the 
same starting point as, much later, Manfred Eigen’s. As Poincare put it, it would take a whole 
lifetime to examine all thoughts and facts in a human head one after another, although the absolute 
majority would be absolutely useless for thinking in a certain direction. Poincare was puzzled by the 
observation that the irrelevant thoughts were unable to step over “the threshold of consciousness” 
and ever come to the mind of the thinker. 


What is the cause that, among thousands products of our unconscious activity, some are called to 
pass the threshold while others remain below? Is it a simple chance which confers this privilege? [19] 




Antonio Damasio, who works in the area of neurobiology of the mind but has a bird’s eye 
view of the entire area of mind research, used a similar language in 1999. 


...I sense that stepping into the light is also a powerful metaphor for consciousness, for the birth 
of the knowing mind, for the simple and yet momentous coming of the sense of self into the 
world of the mental. [20], p. 3. 


Manfred Eigen noted that out of about 10^100 proteins of moderate length (more than the 
whole universe can comprise) the absolute majority are quite useless for living organisms and do 
not exist. Both Poincare and Eigen pointed to the process of selection, which in the historically 
cloudless times of Poincare meant only Darwinian selection. Eigen specifically mentioned Darwin. 


We have to derive Darwin’s principle from known properties of matter, [16], p.469. 


The only difference between the terms competition and selection appears to be that selection 
is the result of competition. Formally, however, selection, meaning reducing a set to its subset, can 
be done without competition. Biology distinguishes between natural and artificial selection. 
Selection here is understood as natural, i.e., stochastic. In the adaptive systems of AI, based on 
learning, the creation of information is the result of teaching, while in natural selection the human 
teacher is nowhere to be found and the set narrows because of the competition for a limited 
resource. 

The central question for us is how to apply the kinetics of selective systems, developed by 
Eigen, to the mind where configurations do not form multiple copies and there are no populations of 
chemical or biological type. A tentative answer is that instead of concentrations, typical for 
chemistry, we have to come back to PT probability. In order to do that, we need a well-defined 
system with a finite set of possibilities, which seems impossible for a He-system by definition. As 
we shall see, we will still be able to have populations of a kind: in time, not in space. 





The problem, therefore, splits into two sub-problems: competition on He-structures and 
building up a He-structure that could code the history of the system. The first problem amounts to a 
change of connector from hypercube to tree or an irregular Bethe-type lattice, as a prelude to scale 
of sets. A model for a He-configuration space will be considered in Part 11, SCALE. 




10.1 BIRDS 


To illustrate the principles of simulation, we start with an Ising-type square lattice as connector. 

The model is called BIRDS because its behavior is reminiscent of a flock of birds that 
sometimes change the direction of flight in a coordinated way, but with a certain degree of dissent. 
It does not aim at simulating avian behavior, for which there are plenty of much better models. 

The description of the algorithm follows. 

The program calculates directions of birds in a flock. 

There are 36 birds $B_m$ ($m = 1, 2, \ldots, 36$) in a flock. 

The topology of the flock is a rectangular 6x6 grid. 

The birds are numbered: 


 1   2   3   4   5   6 
20  21  22  23  24   7 
19  32  33  34  25   8 
18  31  36  35  26   9 
17  30  29  28  27  10 
16  15  14  13  12  11 



The order of numbering is chosen to differentiate between inward and outward bias for 
some experiments. The program compensates for the lower arity on the fringe of the flock. 

Most birds have 8 close neighbors. For example, #29 has # 13, 14, 15, 30, 31, 36, 35, and 
28 as its neighbors. 
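
The numbering and the neighborhood rule can be reproduced with a short Python sketch. It is 
not taken from the BIRDS scripts: the function names spiral_grid and neighbors are 
hypothetical, and the clockwise inward spiral is inferred from the numbering of the birds 
given above. 

```python
def spiral_grid(n=6):
    """n x n grid numbered 1..n*n along a clockwise inward spiral."""
    grid = [[0] * n for _ in range(n)]
    r, c, dr, dc = 0, 0, 0, 1          # start top-left, heading right
    for k in range(1, n * n + 1):
        grid[r][c] = k
        nr, nc = r + dr, c + dc
        if not (0 <= nr < n and 0 <= nc < n) or grid[nr][nc]:
            dr, dc = dc, -dr           # turn clockwise: right, down, left, up
            nr, nc = r + dr, c + dc
        r, c = nr, nc
    return grid

def neighbors(grid, bird):
    """Numbers of the birds adjacent (including diagonals) to `bird`."""
    n = len(grid)
    pos = {grid[r][c]: (r, c) for r in range(n) for c in range(n)}
    r, c = pos[bird]
    return sorted(grid[r + i][c + j]
                  for i in (-1, 0, 1) for j in (-1, 0, 1)
                  if (i, j) != (0, 0) and 0 <= r + i < n and 0 <= c + j < n)

if __name__ == "__main__":
    grid = spiral_grid()
    for row in grid:
        print(" ".join(f"{v:2d}" for v in row))
    print(neighbors(grid, 29))   # interior bird: 8 neighbors
    print(neighbors(grid, 1))    # corner bird: lower arity on the fringe
```

Birds on the fringe have fewer neighbors (three at a corner, five on an edge), which is the 
lower arity that the program has to compensate for. 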

Each bird can move in 32 discrete directions $a_i$ ($i = 1, 2, \ldots, 32$). 

The probability that bird $B_m$ moves next moment in direction $a_i$ is $P_{m,i}$. 

The next state of the system is determined by the 36 x 32 matrix of probabilities $P_{m,i}$. 

BIRDS calculates the probability distribution $P^*$ over all directions for bird $B_m$ as: 

$$P^*_{m,i} = P_{m,i} + M_{m,i} F + \sum_{k \neq m} g_{k,i} P_{k,i}$$

where $P_{m,i}$ is the probability in the previous distribution, $M_{m,i}$ is the memory of the 
previous distribution, $F < 1$ is the parameter of forgetting, $g_{k,i}$ is the parameter of 
influence, and $k \neq m$ marks all neighboring birds. 

The new distribution is stored in memory: $M_{m,i} = P_{m,i}$ for all birds and directions, 
except for the previously selected direction. To make room for experiments, each bird 
remembers its last selected direction $I$ as $P_{m,I} = C$, but the neighbors remember it as 
$P_{k,I} = H$. It is not reflected in the above equation. 

It means that each selected direction stores C for this direction in the memory of the bird, 
but the neighbor who witnessed the turn accepts it as H. The higher C, the higher individual 
conservatism. The higher H, the higher global coherence. 

Interpretation: each bird tends to move in the same direction, but it also tends to forget 
this direction. It is also influenced by its perception of the movement of several close birds in the 
flock. 

An anthropomorphic interpretation is that the bird “thinks” about choosing a direction, 
taking into account its neighbors, and 32 thoughts compete in its brain. There is only one winner. 





Parameter H reflects how a bird, that suddenly changes its direction, is perceived by its 
neighbors: do they take it seriously? H is the measure of “seriousness.” C is parameter of 
conservatism, F is parameter of flexibility. The H/C ratio characterizes how much attention the 
neighbors pay to each other’s initiatives. Parameter g is generic, taking values gin and gout (see 
below). 
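
The update rule for a single bird can be sketched in Python as follows. This is an illustration 
only, not the BIRDS script: the function names update_bird and select_direction are 
hypothetical, a single influence value g replaces gin and gout, the normalization of the 
result is an assumption of the sketch, and the special treatment of the last selected direction 
(C and H) is left to the caller, who passes the memory and the perceived neighbor 
distributions already adjusted. 

```python
import random

def update_bird(p, m, neighbor_ps, F, g):
    """New (normalized) distribution over directions for one bird.

    p: the bird's previous distribution; m: its memory of that distribution;
    neighbor_ps: distributions of the neighboring birds as this bird perceives them;
    F: forgetting parameter; g: one influence parameter for all neighbors
    (a simplification; the text distinguishes gin and gout).
    """
    raw = [p[i] + m[i] * F + g * sum(q[i] for q in neighbor_ps)
           for i in range(len(p))]
    total = sum(raw)
    return [v / total for v in raw]   # normalization is an assumption of this sketch

def select_direction(dist, rng=random):
    """Draw the single winning direction (an index) according to dist."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]

if __name__ == "__main__":
    n_dir = 32
    uniform = [1.0 / n_dir] * n_dir
    # Three neighbors that all strongly prefer direction 3 (index 3).
    biased = [0.5 if i == 3 else 0.5 / (n_dir - 1) for i in range(n_dir)]
    new = update_bird(uniform, uniform, [biased] * 3, F=0.9, g=1.0)
    print(new.index(max(new)), select_direction(new, random.Random(0)))
```

Of the 32 competing directions exactly one is drawn per bird per cycle, which is the 32:1 
selection the text describes. 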

The program has two kinds of output. One is the grid with the numbers of directions 
(Figure 10.1.1); the other is a quiver diagram (Figure 10.1.2). 


Figure 10.1.1 Example of grid output; directions are red numbers at the nodes 


Figure 10.1.2 Example of quiver output 



The quiver output is generated by script qb. It is the very last line of the program. 
If it is removed from the code, the grid output will be seen. 

Additional parameters: 

n: number of program cycles; 

d: initial direction, one of 32 around the full circle. It is not present in script bf2; 
gd: influence of the last direction d on the probabilities of directions d+1 and d-1, i.e., 
“circular neighbors” on the dial of directions. Instead of this stiff distribution, a normal 
distribution for directions could be used, but not in this program; 
gin: influence by a neighbor with a higher number (inward); 
gr: bias toward the center of the flock or outward; gout=gin*gr; 
gout: influence by a neighbor with a lower number (outward). 


Script bf is used to start the simulation with equal probabilities. It erases the previous 
information in master matrix FL. It has preset parameters that can be changed by entering new 
ones. After that, script bf2 is recommended for changing one or a few parameters, with the 
arrays intact. 


To use BIRDS, copy the workspace BIRDS.mat and the scripts bf.m, bf2.m, and bq.m. 

Start by entering bf. After that, use bf2. For bf2, n=1 to 10 is recommended. In this mode, drastic 
changes of direction can be observed even after one cycle. However, only after a certain number 
of cycles (50-100) does the system enter the collective mode. In terms of physics, it reaches a steady 
state far from equilibrium. At C=1, coherence is low, but at C=2 it is clearly seen. 


An example of several consecutive outputs is shown in Figure 10.1.3. 

In terms of the Ising model, BIRDS is a lattice with 32 values of “spin” number. The nodes 
tend to coordinate their “spins” and preserve the previous position, but lose its memory due to 
thermal relaxation. Competition takes place between the values of “spin”; the selection ratio is 
32:1. Small changes of direction are preferred. Parameter gd relates to the degree of preference 
(none for gd=1). 

In terms of collective human behavior, the model approximately corresponds to a group 
of students in an auditorium choosing among a set of actions and trying to accommodate the 
actions of neighbors. There are many other similar interpretations in the area of collective 
behavior. 



Figure 10.1.3 Examples of output of BIRDS 



Although BIRDS imitates some aspects of social behavior, its connector is socially unrealistic. 
The social topology in the Communication Age is an intriguing topic. The connector is by no 
means a full graph, as early enthusiasts of Internet expected. Its typical subgraph is, probably, a 
star with many rays. 


The program size does not depend on the size of configuration space. The model consists 
of two layers of different complexity. One has the local complexity of arrays and the other has 
the global complexity of the parameters. The devil here is not in the details of the arrays, which 
are updated by the program, but in global parameters, which require input and can be numerous, 
generating a combinatorial complexity of their own. Nevertheless, their number never comes 
even close to the size of the generator space. 




10.2 PROTO 


Let us try to apply the idea of BIRDS to the competition of mental patterns (“thoughts”) for an 
unspecified limited resource, which we call, following Poincare and Damasio, the spotlight of 
consciousness. It is just a name of a model that, as any model, should not be understood too literally 
unless it looks persuasive to a specialist closely familiar with the real world. 

The following scalable system, called PROTO, simulates some aspects of selection on a 
connector representing an area of knowledge. It is simpler than BIRDS. 

PROTO calculates the probability of a generator to win the competition for the next 
selection. This time, unlike BIRDS, we have only one winner for the entire generator space. 

To remind, in a scale of sets, each configuration or pattern is represented by a single generator. This 
is why the difference in terminology between generators and configurations does not really matter. 
At the same time, the difference between the up or down direction of bonds, important in [1], is 
important in PROTO. 

The connector here mimics a radial lattice of Bethe type, also known as a Cayley tree 
(Figure 10.2.1). 




Figure 10.2.1 Connector of PROTO 

The topology of the connector graph with 31 nodes can be seen as a subscale of sets built on 
the base of 16 elements. It can be partially interpreted, for example, as: 


1. Cat      4. Tiger          25. Animal 
2. Dog      17. Pet           29. Living form 
3. Lion     18. Wild feline   31. Material object 


The connector can be made much more complicated, for example, nodes 1,3,4, and 18 can 
all be connected to an additional node feline (25A), connected also to node 25, Figure 10.2.2. The 
connector does not need to be regular and the Cayley tree can be somewhat disheveled. 
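
A regular version of such a connector can be generated with a few lines of Python. The sketch 
assumes a complete binary tree numbered level by level, which agrees with the labels listed 
above (1 and 2 under 17, 3 and 4 under 18, 17 and 18 under 25, and so on up to the root 31) 
but is not confirmed by the program itself; the function names binary_tree_parents and 
adjacency are hypothetical. 

```python
def binary_tree_parents(n_leaves=16):
    """Parent of every node in a complete binary tree numbered level by level.

    Leaves are 1..n_leaves, the next level continues the numbering, and the
    root gets the last number (31 for 16 leaves). n_leaves must be a power of 2.
    """
    parent = {}
    first, width = 1, n_leaves          # first node number and size of the current level
    while width > 1:
        next_first = first + width
        for k in range(width):
            parent[first + k] = next_first + k // 2
        first, width = next_first, width // 2
    return parent

def adjacency(parent):
    """Undirected connector: node -> sorted list of neighbors (parent and children)."""
    adj = {}
    for child, par in parent.items():
        adj.setdefault(child, []).append(par)
        adj.setdefault(par, []).append(child)
    return {node: sorted(ns) for node, ns in adj.items()}

if __name__ == "__main__":
    parent = binary_tree_parents()
    adj = adjacency(parent)
    print(len(adj))              # number of nodes in the connector
    print(adj[17], adj[31])      # neighbors of "Pet" and of the root
```

An irregular connector, such as the one with the additional node feline, would simply add 
more entries to the adjacency map. 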












Figure 10.2.2 Node 25A, feline, is added to the connector in Figure 10.2.1 


PROTO calculates probability $P_i$ of selecting generator $G_i$ ($i = 1, 2, \ldots, 31$) at time t+1. The 
generators are selected at random, according to their probability distribution. 

We attribute the following properties to the generators in PROTO: 

1. The longer the generator is in the spotlight of consciousness, the longer it will stay there, 
by making, so to say, its own copies in time. 

2. The longer the generator stays in the spotlight, the shorter time it will stay there because 
of forgetting. 

3. The neighbors in the connector graph positively influence the generator. 


The question how a generator can sense the probability of a neighbor will be left here without an 
answer, but with mentioning the wave function in quantum physics as a very distant metaphor. To 
keep up with the Joneses is another one. 


Calculation starts with the distribution of probabilities at time t. PROTO remembers only 
the previous probability distribution and is a typical Markov process in an Ar-system. 



The following equation describes the behavior of PROTO: 


$$P_i^{t+1} = A P_i^t - F M_i + \sum_{j \neq i} g_{ij} P_j^t \qquad (10.2.1)$$

$M_i = P_i^t$, except for the generator selected at time $t$, for which $P_j^t = C$, but $P_j^{t+1} = 0$. 

This is not a necessary modification of the basic Eigen’s equation. It is done in order to 
exclude the earlier selected generator (acton) from the next selection and prevent the stagnant 
repetitive selection. It simulates the phase of rest after the excitation of the neuron and keeps the 
ball of excitation in the air. Another complication, for the sake of experiment, is that the 
neighboring generators accept acton’s probability as C*H, but in acton’s own memory, the past 
probability is written as C, similarly to how it was done in BIRDS. To simplify the picture, A=1 
everywhere. 

On this basis, the probability distribution for the next selection is calculated. The absolute 
values of parameters do not matter because the distribution is always normalized. 

The model uses parameters: 

F, forgetting; C, memory of an acton right after selection; H, factor of perception of 
acton’s probability by neighbors, and g, factor of influence, which for the tree connector can be 
split into gup and gdown, depending on the direction on the tree. 

Parameters gup and gdown are designed in such a way that the arity of the node is taken 
into account. The experimenter can play with them, too. 

At a certain combination of parameters, the system can freeze around some nodes, slowly 
drift over its phase space, retaining its compactness, scatter over large areas, or it can jump 
between distant areas, imitating a spontaneous hypothesis about some new properties of the 
world. This link between distant nodes is a precondition of creativity. 

PROTO spontaneously scans not just generators, but also their bond couples. The phase 
trajectory of the system, i.e., the sequence of actons, can be obtained as output. 
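
The selection loop of PROTO can be sketched in Python as follows. This is not the MATLAB 
program: it is one possible reading of equation (10.2.1) with A=1, in which the sign of the 
memory term follows the equation as printed, negative values are clipped to zero, and the 
connector is assumed to be a complete binary tree with 31 nodes. The names build_tree, 
proto_step, and run are hypothetical; gdown=1.0 corresponds to w=2 with gup=0.5. 

```python
import random

def build_tree(n_leaves=16):
    """Parent map of a complete binary tree: leaves 1..16, root 31 (an assumed layout)."""
    parent, first, width = {}, 1, n_leaves
    while width > 1:
        for k in range(width):
            parent[first + k] = first + width + k // 2
        first, width = first + width, width // 2
    return parent

def proto_step(P, acton, parent, F=0.9, C=1.0, H=0.7, gup=0.5, gdown=1.0, A=1.0):
    """Distribution for the next selection, given the current one and the last acton."""
    M = dict(P)
    M[acton] = C                          # acton's own memory of its probability
    seen = dict(P)
    seen[acton] = C * H                   # what the neighbors perceive
    raw = {i: A * P[i] - F * M[i] for i in P}
    for child, par in parent.items():
        raw[par] += gup * seen[child]     # influence toward the root
        raw[child] += gdown * seen[par]   # influence toward the leaves
    raw[acton] = 0.0                      # the last acton is excluded from the next selection
    raw = {i: max(v, 0.0) for i, v in raw.items()}   # clipping is an assumption
    total = sum(raw.values())
    return {i: v / total for i, v in raw.items()}

def run(n=50, start=2, seed=0, **params):
    """Return the sequence of actons (the phase trajectory) over n selections."""
    rng = random.Random(seed)
    parent = build_tree()
    nodes = list(range(1, 32))
    P = {i: 1.0 / 31 for i in nodes}
    acton, trajectory = start, [start]
    for _ in range(n):
        P = proto_step(P, acton, parent, **params)
        acton = rng.choices(nodes, weights=[P[i] for i in nodes], k=1)[0]
        trajectory.append(acton)
    return trajectory

if __name__ == "__main__":
    print(run(n=20, start=2))
```

Because the last acton always receives zero probability, the same node is never selected 
twice in a row, which is the "phase of rest" described above. Whether the walk stays compact 
or scatters depends on the parameters, and this sketch makes no claim to reproduce the 
behavior of the original program. 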





Simply speaking, the more we think about the subject, the more we tend to think about it 
next moment, but the more we think about it, the more we get tired of it and tend to jump to 
something else (and this is exactly what various asides in this paper illustrate). These simple 
principles, as well as the influence of close neighbors, as Manfred Eigen demonstrated, describe 
“natural” selection of structures built of atom-like objects. 

It is important to bear in mind that we do not use here any knowledge about the nature of 
neural processes in the brain. A neuro-physiologist would say “firing” instead of selection, but 
we carefully avoid any physiological interpretation. Nevertheless, neurophysiology in an oblique 
way influences some aspects of the model. The “naturalness” of the model lies in the most 
general principles of competition and selection in living systems, but not in its details. 


The same principles define the Lotka-Volterra systems. The more hares today, the more 
tomorrow. The more hares today, the fewer tomorrow because the lynx will multiply. Such “games 
of life” over many millennia shaped new species, creating new genetic knowledge. 


The MATLAB program PROTO requires workspace PROTO.mat, is stored as scripts 
proto (main), proto1, and proto2, and generates a figure of the tree with red asterisks meaning 
acts of selection (actons). 

The output of the system, depending on the number of iterations n , is a population of 
actons of size n over a segment of time t=n. To visualize the population, the positions of the 
asterisks symbolizing consecutive actons are slightly randomized. The green asterisk indicates 
the starting node. The starting node can be externally enforced in the beginning or at any 
moment during an experiment. In principle, two or more actons could be enforced, in a 
Pavlovian mode. 


To activate the program, save workspace (matrices P, T, and GXY ) and the scripts. The main 
program offers a choice of starting from equal probability distribution (proto1) or continuing 
from the previous workspace, parameters, and probability distribution (proto2). The starting 
program can prompt for: 





n, number of steps (selections), 
a, initial acton, from 1 to 31, 

C, memory of previous selection as acton; stored in the memory of the acton, 

H, factor for probability of the acton, as neighbors see it: $P_a$ = C*H. 

F, factor of forgetting; if set to a value >1, it corresponds to Eigen’s A>1. 

The following parameters are preset, but can be changed: 

gup=0.5, parameter of influence upon neighbors (toward the root of the tree), and 
w=2, gdown=w*gup. 

Prompts can be blocked or activated in the PARAMETERS SECTION of the script. 

The parameters can be changed by entering at any point between the executions. The initial 
settings are: 

H=0.7; F=0.9; C=1; gup=0.5; w=2; a=2; t=0.1 

Entering tra, after the program stops, gives the sequence of actons. Matrix PP (31,2) 
is the probability distribution, normally not displayed. 

To test the program, load the workspace, type proto and set the rest of parameters 
equal to one (which is not quite realistic). 

The program includes pause t after each selection, at the end of the code. Parameter t is 
set to 0.1 sec. 


Large C and F make the walk conservative and restricted to small areas. High H , gup , 
and gdown increase ergodicity. 
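The conservative effect of a large F can be looked at in isolation. The following is a minimal sketch, not a part of PROTO: it assumes only the memory rule visible in the appendix code (each cycle multiplies the memory of a node by F), and the names F_values, k, and M are introduced here for illustration.

```matlab
% Sketch: memory of a node that was an acton once and is not selected again.
% Assumption: memory is multiplied by F on every cycle, so that k cycles
% after the selection it equals C*F^k. F_values, k, and M are illustrative.
C = 1;                 % memory assigned to an acton
F_values = [0.5 0.9];  % two forgetting factors to compare
k = 1:10;              % cycles elapsed since the selection
M = zeros(numel(F_values), numel(k));
for r = 1:numel(F_values)
    M(r,:) = C * F_values(r).^k;   % geometric decay C*F^k
end
disp(M(:, end)');      % memory left after 10 cycles for each F
```

With F close to 1, the trace of an earlier acton fades slowly, which is one plausible reading of why the walk stays near previously visited nodes.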

If some parameters should be kept constant in order to minimize input, the necessary 
modification can be easily entered into the code. 


The basic matrix P (32 x 11) in the code has a spare column 11 for modified experiments. It has a 
spare row 32 with zeros. The connector graph is coded with matrix T (31 x 31). The coordinates 
of the nodes are stored in matrix GXY. Number 32 is used instead of 0 to signify the absence of a 
neighbor, in order to avoid a problem with a zero index of a matrix element. Row 32 stores 
zeros. 
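The role of the extra row can be shown on a toy example. This is a sketch under stated assumptions: a three-node chain stands in for the 31-node tree, and the names nbr, p, and infl are hypothetical, not variables of PROTO.

```matlab
% Sketch: why an absent neighbor is coded as an extra row index, not as 0.
% Assumption: a toy chain 1-2-3 replaces the 31-node tree of PROTO;
% nbr, p, and infl are illustrative names.
N = 3;                       % number of real nodes
nbr = [2 N+1; 1 3; 2 N+1];   % neighbor lists; N+1 marks "no neighbor"
p = [0.2; 0.5; 0.3; 0];      % probabilities, with an extra zero row N+1
infl = zeros(N,1);
for i = 1:N
    for j = 1:size(nbr,2)
        % p(nbr(i,j)) is always a valid index; an index of 0 would be an error
        infl(i) = infl(i) + p(nbr(i,j));
    end
end
disp(infl');                 % absent neighbors contribute exactly zero
```

The sentinel row removes the need for any test of the kind "does this neighbor exist" inside the loop.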


A hypothesis silently incorporated in this model draws a parallel between the action of a 
catalyst in chemistry and the match-making properties of nodes. It states that two generators that 
are close to a third one are close to each other even if they are not coupled. For example, if in a 
certain knowledge representation CAT is close to PET, and DOG is close to PET, CAT and 
DOG are closer to each other than if there were no such triangulation. On the contrary, CAT and 
GRASS are not in the same neighborhood. Yet, if in our representation MOUSE is in the 
GRASS and CAT is in the GRASS, we expect a HUNT, i.e., a link between CAT and MOUSE. 
Regardless of that, MOUSE, CAT, and HUNT form a stable cluster in yet another or the same 
representation. 
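The triangulation in the CAT, DOG, PET example can be illustrated with a small adjacency matrix. This is only a sketch: scoring closeness by counting shared neighbors is an assumption made here for illustration and is not the rule used in PROTO.

```matlab
% Sketch: closeness through a shared neighbor (CAT-PET-DOG).
% Assumption: closeness is scored by the number of common neighbors,
% read off the square of the adjacency matrix; this rule is illustrative.
names = {'CAT','DOG','PET','GRASS'};
A = zeros(4);
A(1,3) = 1; A(3,1) = 1;      % CAT - PET
A(2,3) = 1; A(3,2) = 1;      % DOG - PET
A2 = A*A;                    % A2(i,j) = number of common neighbors of i and j
fprintf('%s-%s shared neighbors: %d\n', names{1}, names{2}, A2(1,2));
fprintf('%s-%s shared neighbors: %d\n', names{1}, names{4}, A2(1,4));
```

CAT and DOG, although not coupled directly, share PET, while CAT and GRASS share nothing in this toy representation.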

The following examples illustrate the behavior of PROTO. 

PROTO was run at: a=2, n=50, F=0.8, H=0.2, C=2, gup=0.5, w=2 (therefore, 
gdown=2*0.5=1). The results are shown in Figure 10.2.3, where two populations of “thoughts” 
can be seen. 






Figure 10.2.3 Two populations of “thoughts” in PROTO, 30 cycles 


The trajectory of the system was (after initial acton 2): 

tra = 30 → 28 2 30 2 30 28 2 28 2 30 2 30 28 30 → 

28 30 2 30 2 28 30 2 30 2 28 24 2 30 → 2 


After adding 60 more cycles (total of 90), no new populations appeared (Figure 10.2.4). 
The system was too conservative because of high C and low H. 












Figure 10.2.4 Two populations of “thoughts” in PROTO; 90 cycles 

Next, in the CONTINUE mode, the parameters were changed to: C = 0.5, H =0.5, F = 
0.5. Figure 10.2.5 shows the immediate scattering of one of the populations. After 120 cycles, 
however, PROTO still remembers its original acton 2, but keeps exploring its second area of the 
connector. 

Trajectory: tra = 30 → 2 28 24 2 28 2 21 23 2 21 2 27 22 27 → 

21 22 27 2 22 2 27 2 21 22 24 23 22 2 → 21 


Finally, Figure 10.2.6 illustrates two populations formed at a different, less relaxed set of 
parameters: n=30, a=2, F=0.8, C=1, H=0.5. 

Trajectory: tra = 8 2 18 27 2 27 2 27 2 27 2 27 2 27 18 2 

22 2 22 18 22 2 22 18 2 22 2 18 22 18 












Figure 10.2.5 Two populations of “thoughts” in PROTO; relaxed parameters, 30 
new cycles 

In this series, node 8 was the initial acton (after 2), but the populations there did not take 
root. The bridge 17-25 between nodes 2 and 18 remains, so to speak, subconscious. 

The absence of limits imposed by percolation, i.e., the possibility of jumps over the 
connector, seems to be the most striking property of the model. The jumps between distant 
nodes can be regarded as evidence of dynamic memory or, in terms of the science of 
complexity, emergent behavior. PROTO remembers its initial acton for a long time, although 
it is not stored anywhere explicitly. 












Figure 10.2.6 Three populations of “thoughts” after 30 cycles 

There is the famous problem of the origin of the elephant's trunk: when it is small, it is of no use, 
when it is large, it cannot be explained by small mutations. Of course, there is Rudyard Kipling's 
explanation, but it is quite a stretch. The jumps in PROTO suggest that if the genome is organized 
as the scale of sets, then mutations can happen in such a way that the whole block of phenotype 
changes. 











Discussion 

After many experiments, which would take too much space to report here, a conclusion could be 
drawn that the system manifests a rather rich behavior, depending on parameters. Even though 
the experimental generator space was small, the continuous multidimensional parameter space 
was large. An unexpected problem (probably, common for all virtual models) was the temptation 
to explore this primitive model and to play with it, instead of building a more complex and 
realistic one. Before the expansion could be done, some preliminary observations may be of 
interest. 

Even if we do not know what exactly consciousness is, we can tentatively answer what 
the subconscious is. We select in PROTO only the most probable candidate for the focus of 
consciousness, but we can compile a list of runners-up, as in a beauty contest, with decreasing 
probability. Technically, we can do it by casting a random number over the distribution of 
probability remaining after the previous selection, collapsing the winning probability segment to 
zero and starting a new selection. Those subconscious levels of thinking form the subconscious 
bulk of the mental iceberg. They may or may not influence our thinking, which normally is 
mostly conscious, but not quite, and we do not even know to what extent because, by definition, 
we cannot see beyond the focus of consciousness. 
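The procedure for listing the runners-up can be sketched as repeated roulette-wheel selection. This is a minimal illustration under assumptions: the weight vector p and the names order and winner are hypothetical and are not taken from PROTO.

```matlab
% Sketch: ranking "runners-up" by repeated roulette-wheel selection.
% Assumption: p is any vector of nonnegative weights with a positive sum;
% the names p, order, and winner are illustrative.
p = [0.1 0.4 0.2 0.3];       % probability distribution over four nodes
order = zeros(1, numel(p));  % order(1) is the winner, the rest are runners-up
for k = 1:numel(p)
    c = cumsum(p) / sum(p);  % renormalized cumulative distribution
    c(end) = 1;              % guard against rounding in the last segment
    r = rand;                % random number cast over the distribution
    winner = find(c >= r, 1);   % first segment that covers r
    order(k) = winner;
    p(winner) = 0;           % collapse the winning segment to zero
end
disp(order);                 % one random ranking of all four nodes
```

Each pass draws from what remains of the distribution, so a node that has already won cannot be drawn again, and the list tends to run from the more probable to the less probable candidates.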

We shall also attempt to answer the question why we need consciousness at all and what 
its biological role is. 

In short, why are we able to describe something observable (a grazing horse) and convey 
it? We are able (involuntarily, because of the design of our mind, where only one or two 
thoughts can come to consciousness at a time) to separate the thoughts horse and graze as 
consecutive thoughts and, therefore, to separate and express them as consecutive terms of the sentence. 
The same is even more true of describing complex events extended in time. Since the 
content of consciousness is ordered in time, it is ordered in speech. Consciousness makes the 
parsing of reality possible. It is a condition for an analytical mind. 


One could see a deep analogy with the mechanism of ribosomal synthesis of a peptide over the 
RNA script. The ribosome selects one nucleotide triplet at a time, being “unconscious” about the 
rest. Of course, there is not much entropy in following the 1D Ariadne’s thread. 


Language is a social phenomenon, understanding of an utterance is expected, and 
wherever we have communication between two animals or people about an external object, we 
may suspect a form of consciousness. A bird’s song or a lion’s roar communicate, probably, only 
the internal states of the animals, as a cell phone communicates a low battery or end of charging, 
but a warning shriek of a monkey seeing a predator seems to indicate consciousness and 
language. 

Consciousness spontaneously scans the content of the mind, converting it into a linear 
sequence of states. As we suggested earlier, the fact of understanding is confirmed by the act of 
the communication of the result of understanding to somebody else. Consciousness, therefore, 
is a more technical than mystical term. The geese that saved Rome manifested their patriotic 
consciousness in their warning cries. 

Spontaneous thinking, in chemical terms, is an autocatalytic process, like life itself. The 
mathematics and physics of such processes were explored in detail by Ilya Prigogine, Manfred 
Eigen, and others, especially, around the Santa Fe Institute of Complexity. Order (new 
knowledge) is created in such systems because of the inflow of Gibbs energy and dissipation of 
most of it as heat. The difference is retained as order. In our model, the forgetting embodies the 
loss in the form of dissipation, and novelty creates new order. The conflict between the old and 
the new creates information. 

The next step in the development of this model should be the formation and extinction of bonds 
and nodes as a result of spontaneous activity, in other words, generation of new knowledge by 
pure invention and mental game, not by acquisition, because there would be no external source. It 
could go through stressed transition states created by an imposed problem. 

Our modest model alone does not justify the above far-reaching conclusions, hypotheses, 
and mere fantasies, but it may point to a rewarding direction of “artificial natural” intelligence. 
Following this idea, we may stumble at a definition of consciousness, for example, but we are 
not forbidden to look for its abstract foundations even without knowing what exactly it means as 
a chemical and physical phenomenon. 

In our model, no homunculus is required. The algorithm of natural selection is all we 
need. The system behaves as a non-linear cellular automaton. It is not a learning neural network 
because it gives no functional output. To see how a useful function could be built on the platform 
of spontaneous activity is our future task. 

Even if we design a humanoid utterly clever, we may confront unforeseen consequences 
that would conflict with our design. We must part with our creation, giving it the last touch, as in 
the fresco by Michelangelo. Moreover, we should expect it to discover forbidden knowledge and 
to revolt against its creator, as the image of Golem suggests. 




11. SCALE 


Both BIRDS and PROTO are Ar-systems because their generator space and connector remain 
fixed. The purpose of SCALE is to simulate evolution of a He-system from zero, exploiting the 
property of novelty. 

The MATLAB program SCALE builds a record of its inputs (history) in the form of 
matrix WORLD, which is a sub-scale of sets, partially ordered along the time axis. Time here is the 
discrete Leibniz time, i.e., the ordered set of events (“... time is an order of successions,” [21], p.25). 
Time does not move if nothing happens. 

Recall again that in the scale of sets, each configuration is also a generator. WORLD is 
the connector graph, the nodes of which are generators. Therefore, WORLD is a configuration. 
Each generator has a unique name, which can be a word or a number. The name is just a symbol 
and is extralinguistic: it is not necessarily a word or symbol of an existing language. SCALE 
stores the sequence of names in the order of their issue. 

In the beginning of the history of WORLD, the generator space contains a single 
“empty” generator with the name ‘!’. 


SCALE works in the following sequence of stages (examples of inputs are in square 
brackets). 




1. SCALE displays: 

1 to start, 2 to continue [ 1 ] 

Once started by typing 1, it can be continued with the same WORLD by typing 2. Typing 1 
erases the previous WORLD. 

2. It asks for the number of cycles in the session by prompting: 

nn [16 ] 

3. It prompts: 

enter components [ '1 8 7' ] or [‘cat dog’] 

Components must be entered interspaced and as a single character string. 

4. If the input cannot be found in WORLD: 

This is new. Name it ['[' ] or [‘pet’] 

5. If the input is a single generator and can be found in WORLD as single generator, 
the output is: 

This is old 

If the input is a set of generators, for example, ‘L e o n’ (spaced), the program asks: 

Looks like W32: Leon 

Is it new? 1/0 

[ 0 (i.e., no) ] 

The output Looks like means that WORLD contains a permutation of the input, for example, 
‘N o e l’. SCALE distinguishes between ‘a b c d’, ‘c a d b’, and ‘abcd’. The first two are sets of four 
signs, while the third is a single name, for example, the name for the first. 


Suppose, we enter ‘Noel’ and get the same response: 


Looks like W32: Leon 


Is it new? 1/0 


[ 1 (i.e., yes) ] 


If so, SCALE will ask for a name, which can be entered as ‘Noel.’ 

This feature is intended to simulate, in a simplified form, the coding of configurations, 
with similarity transformation PERMUTATION, as a model of more complex transformations, 
for example, SIZE. 

When a new generator enters WORLD, it is always either a new single generator or a new 
combination of old ones. The connections between members of the combination and the name 
are stored in the WORLD matrix, expanding the connector. Therefore, throughout the history of 
WORLD, both generator space and connector change. 
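The distinction between old, new, and permuted inputs can be sketched without the WORLD matrix. This is a simplified illustration, not the code of SCALE: it assumes that a combination is identified by its sorted components, it skips the "Is it new? 1/0" question, and the names store_keys, store_names, given, and verdict are hypothetical.

```matlab
% Sketch: recognizing old inputs and permutations, in the spirit of SCALE.
% Assumptions: a combination is keyed by its sorted components, and a
% permutation is always reported as "Looks like"; all names are illustrative.
store_keys  = {};            % keys of the generators stored so far
store_names = {};            % name given to each stored generator
inputs = {'a b c d', 'c a d b', 'abcd'};
given  = {'abcd', '', ''};   % name typed when an input turns out to be new
verdict = cell(size(inputs));
for k = 1:numel(inputs)
    parts = strsplit(inputs{k}, ' ');        % components are space-separated
    if numel(parts) == 1                     % a single generator
        if any(strcmp(store_names, parts{1}))
            verdict{k} = 'This is old';
        else
            verdict{k} = 'This is new';
            store_keys{end+1}  = parts{1};
            store_names{end+1} = parts{1};
        end
    else                                     % a set of generators
        key = strjoin(sort(parts), ' ');     % order-independent key
        hit = find(strcmp(store_keys, key), 1);
        if isempty(hit)
            verdict{k} = 'This is new';
            store_keys{end+1}  = key;
            store_names{end+1} = given{k};
        else
            verdict{k} = ['Looks like ' store_names{hit}];
        end
    end
end
disp(verdict);
```

In this toy run the first input is stored under the name abcd, the second is recognized as its permutation, and the third, a single name, is found to be old.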

The following example of a gradually created small world illustrates the variety of 
possible WORLDs. The WORLD (fragmentary) contains the words in English, Swedish, 
Russian, and Latin, Figure 11.1. The top line is the base of the scale of sets. 





NAMES = ! dog cat hund katt pyos kot 



Figure 11.1 Example of a polylingual WORLD with five 
sub-WORLDS: four languages and biotaxonomy (incomplete). 


Figure 11.2 presents a WORLD in 3D. It portrays some relationships between characters 
of TV sitcom Providence. The model was suggested by Ulf Grenander. 



Figure 11.2 The WORLD of a TV sitcom in 3D. History runs along Z axis. 






























Axis Z is the historical time, i.e., the sequence of building the WORLD. 
In Figure 11.3 the same WORLD is projected on the plane of the base set. 





Figure 11.3 The WORLD of Figure 11.2 in 2D 


SCALE does not need a teacher or instructor because the names are not essential. The 
main property to be recognized for the evolution of the WORLD is novelty. WORLD, therefore, 
reflects not only a certain artificial or natural world, but also the history of the presence of the 
internal WORLD in the external world. The old is recognized, while the new is remembered. 


The WORLD “LINES” is built in the following way. 


Nine cells of a mini-retina are numbered as in Figure 11.4. Cell 9 is the empty generator. 






1   2   3 

8   9   4 

7   6   5 


Figure 11.4 Numeration of cells in a mini-retina 

The WORLD fills up first with individual cells from 1 to 9, named from ‘1’ to ‘9’, and 
then their combinations. For example, X is a combination of / and \. The following is a slightly 
re-formatted actual output. 


enter components 

'1 8 7' 

This is new. Name it 

'[' 

enter components 

'3 4 5' 

This is new. Name it 

']' 

enter components 

'1 2 3' 

This is new. Name it 

'~' 

enter components 

'7 6 5' 

This is new. Name it 

'_' 

enter components 

'1 9 5' 

This is new. Name it 

'\' 

etc. 



In Figure 11.4, letters are formed out of simpler subsets of the retina. 







Figure 11.4 Formation and naming of retina subsets 


Next, the triplets of the cells and larger retina subsets are combined and named, see 
Figure 11.5. 


























Figure 11.5 Letters formed from lines 


Figure 11.6 presents two projections of the WORLD “LINES” built on the base set of 
nine pixels. 



Figure 11.6 Projections of the WORLD of LINES 

















































































































Figure 11.7 presents the flat and 3D projections of the WORLD of PROTO, built with 
SCALE, node by node from 1 to 31. 





Figure 11.7 Projections of the WORLD of PROTO 


SCALE has some mini-utilities: 

1. If you want to check WORLD for a name, type: link 

2. To see NAMES, type: NAMES (or NM) 

3. To display the 3D world, type plotW 

Of course, the WORLD matrix (WW) can be displayed, too, for example, as sparse 
matrix sparse (WW). 

Program LINK gives the complete spectrum of an old generator, listing its name and all 
its entries in the WORLD, i.e., downward and upward connected generators, for example: 


» link 

NAME to check 'X' 

/ \ 


Also, for more complex combinations: 






















» link 

NAME to check 'K' 

/ E L O X [ \ 

» link 

NAME to check V 

! - 3 7 < > K X [ \ ] _ | ~ 

» link 

NAME to check 

/ 1 2 3 < > E O T [ \ ] 


The generator space for the WORLD in Figure 11.6 is 

NAMES = ! 1 2 3 4 5 6 7 8 | - \ / [ ] ~ _ X T + E L O K < > (size 28) 

If desired, the spectrum can be split into up and down entries, and entries for new samples 
of an old type can be made, but not in this program. 
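The split of a spectrum into down and up entries can be sketched on a toy connector. This is an illustration under assumptions, not the LINK program: the matrix W, its convention (row made of columns), and the composition of K are hypothetical and chosen only to make the example small.

```matlab
% Sketch: splitting the spectrum of a generator into down and up entries.
% Assumption: W(i,j)=1 means that generator j is a component of generator i;
% the toy matrix W and the composition of K are illustrative, not SCALE's WW.
names = {'/', '\', 'X', '[', 'K'};
W = zeros(5);
W(3,1) = 1; W(3,2) = 1;              % X is made of / and \
W(5,1) = 1; W(5,2) = 1; W(5,4) = 1;  % K is made of /, \ and [ (hypothetical)
q = find(strcmp(names, 'X'));        % generator to check
down = names(W(q,:) == 1);           % components of the generator
up   = names(W(:,q)' == 1);          % combinations that contain it
q2 = find(strcmp(names, '\'));       % a second query, for comparison
up2 = names(W(:,q2)' == 1);
disp(down); disp(up2);
```

Here the down entries of X are / and \, while the up entries of \ are X and K; X itself has no up entries in this toy WORLD.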


Evolution of a MIND where a WORLD is being built is a separate subject. In short, with 
time, most of the WORLD is forgotten, but part of it turns into KNOWLEDGE, i.e., a flat 
WORLD, as in Figure 11.6 (left), is stored in the long-term memory. Probably, the entries 
compete for the place in KNOWLEDGE, too. Even the bees do it. 




Conclusion 


The following propositions have been combined in this paper: 


1. Ulf Grenander: Objects in the mind (“thoughts” or “ideas”) are configurations within the 
framework of Pattern Theory. This thesis, for the first time, was used by Ulf Grenander for the 
groundwork in building a model of the mind that does not ignore its complexity. The model 
demonstrates properties similar to those of chemical systems in equilibrium. 

2. Henri Poincare: Objects in the mind (“thoughts”) compete for the place in consciousness. 

3. Manfred Eigen: Darwinian competition of linear sequences in the sequence space follows 
formal chemical kinetics. This proposition can be generalized to configurations in the sense of 
PT. 

4. Chemical kinetics is based on the concept of transition state. This proposition can be 
generalized to irregular configurations in the sense of PT, which closes the cycle of the four 
propositions at a different level and returns us to Proposition 1. 

5. Bourbaki: A simple system can expand as the scale of sets toward unlimited complexity. 
From the concept of the scale of sets, the distinction between Ar- and He-complexity was 
suggested. 

The described models are just some toys to play with while designing a mind that would 
connect AI with NI. Some of their meaningful properties, however, can be seen even at their 
embryonic stage. Thus, they do not require any sweep strategy: they are self-sweeping (or 
self-scanning). If realized as parallel systems of cellular automaton type, they do not require any 
numbering of their generators. Neither do they need to store any large arrays. In other words, 
they are homunculus-free. It is the same as to say that they belong to NI as much as to AI. 

Further, the models can have a limited number of global parameters and an unlimited 
number of simple local nodes. Notably, the size of SCALE, as the size of human brain, does not 
depend on the size of the world it stores, perceives, and processes, while the world expands. This 
means, most important, that they are contractible, metaphorically speaking, in the sense of 
homotopy theory: the large system can be reduced to zero volume without any change in the 
algorithm. Conversely, a small system can be expanded as a homotopy. Even more important, 
the expansion due to the interaction with environment can occur, up to a point, in an autonomous 
way, as self-learning based on the distinction between the old and the new. 


Homunculus wears two hats in AI: as internal operator and external teacher. The models in this 
paper avoid homunculus by making the core of the program so simple that it can be entrusted for 
execution to a single cell in a cellular automaton in the most general sense. 


This treatise, inclusive of such subjects as AI and chemistry, as well as vacuum cleaners, 
poems, and bees, seems to deal with very different worlds. World is a useful axiomatic concept 
that cannot be defined. A tentative classification of worlds can be suggested: 


Real (RW):      natural (RNW),   artificial (RAW) 

Virtual (VW):   real (VRW),      unreal (UVW) 









The virtual real world is a simulation based on observable principles, for example, for 
scientific purposes, while the virtual unreal world, for example, a simulated transformation of a 
man into a vampire, is based on arbitrary or highly hypothetical principles. 


Steven Spielberg’s film Artificial Intelligence, unlike such films as The Lord of the Rings, tries to 
preserve realism of the laws of nature for at least half the film, given the basic premise of the plot. 
Both brilliantly told tales, regardless of their message, illustrate the difference between real and 
unreal virtual worlds. At the same time, both preserve realistic patterns of human behavior and 
attitudes. 


The borderlines between the worlds are not strictly defined and a theory of worlds 
regarding the above distinctions could be an independent subject. 


The next steps in moving along the pathways sparsely marked in this paper could be: 

1. The WORLD where generators compete for presence, while fading due to forgetting 
boosted by the lack of retrieval, and being revitalized due to subsequent referrals. This presumes 
a contact with environment. The process of the transformation of the world as history (Figure 
11.2) into the world as knowledge (Figure 11.3) is of particular interest. 

2. Simulation of the environment that imposes its own order on the spontaneous activity 
of the mind. This was attempted with PROTO by forced activation of generators and, of course, 
had the expected effect. 

3. Creation of autonomic agents that may have preset perception systems (vision, 
hearing), but are self-learning in the sense that they build their internal WORLDS without a 
teacher, through distinction between new and old. For example, a baby Roomba can be created, 
which explores the environment and creates its WORLD before going into the stage of maturity 
where success and failure are reinforced and discouraged, accordingly, by “good girl!” and “stop 
it!” 


A preset (inborn) system of perception for vision, for example, could use various similarity 
transformations on two-dimensional lattices. PT provides an ideal theoretic apparatus for that. 
This could generate extralingual pattern outputs, such as those named, for convenience, in 
English “line,” “L-shape,” “human face,” “grass,” etc., ready for entering the WORLD where 
PERMUTATION is, most probably, the only feasible transformation. There is no physical 
movement in the brain comparable with the powerful transformations of the eye movements. 


4. Transition from extralingual WORLD to language, which is, probably, just a reversed 
SCALE: not from inputs to the world but from WORLD to linguistic outputs. Ulf Grenander’s 
model offers well-tilled soil for that. 

5. Fine chemistry of thoughts that follows the mechanism of transition from regular 
thoughts to new regular thoughts through an irregular transition state. 
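Step 1 of the list above, fading by forgetting and revitalization by referral, can be sketched in a few lines. This is a toy illustration under assumptions: the decay rule, the threshold, and the referral schedule, as well as the names F, thresh, strength, and referrals, are hypothetical choices and not a part of SCALE or PROTO.

```matlab
% Sketch for step 1: strength of generators under forgetting and referral.
% Assumptions: every generator keeps a fixed fraction F of its strength per
% time step, a referral resets the strength to 1, and generators below a
% threshold count as forgotten. All names and values are illustrative.
F = 0.8;                     % fraction of strength kept per step
thresh = 0.2;                % below this a generator counts as forgotten
strength = [1 1 1];          % three generators, all freshly stored
referrals = [1 0 0; 0 0 0; 1 0 0; 0 0 0; 1 0 0; 0 0 0; 1 0 0; 0 0 0];
for t = 1:size(referrals,1)
    strength = F * strength;               % forgetting
    strength(referrals(t,:) == 1) = 1;     % retrieval revitalizes
end
alive = strength >= thresh;  % generators that survive as knowledge
disp(strength); disp(alive);
```

In this run only the first generator, which is referred to every other step, stays above the threshold; the two generators that are never retrieved fade away.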


Pattern kinetics and pattern history seem to be other directions worth exploring. The author 
and Ulf Grenander have made some steps toward a pattern theory of history in their unpublished 
manuscript History as Points and Lines, from which Figures 5.3 and 5.4 were reproduced. Some 
rich food for pattern thought can be found in the new book by Bertrand Roehner and Tony Syme, 
Pattern and Repertoire in History [22]. The Sociology of Philosophies: A Global Theory of 
Intellectual Change by Randall Collins [23] is a striking intellectual adventure in the pattern spirit, 
where the competition of ideas is in the focus. Even the title of the first chapter, “Coalitions in 
the Mind,” gives the taste of the whole, the chapter starting with: 


Intellectuals are people who produce decontextualized ideas. These ideas are meant to be true or 
significant apart from any locality, and apart from anyone concretely putting them into practice. 
A mathematical formula claims to be true in and of itself, whether or not it is useful, and apart 
from whoever believes it. 

Randall Collins [23], p.20 


Another proposition, expressed by the author and Ulf Grenander in History as Points and 
Lines, states that the apparatus of the Internet, for the first time in history, opens the possibility to 
actually measure the state of the real world close to real time, its position on the 
“globescape,” and to follow its transition states. A similar idea was expressed in another highly 
relevant and fundamental book by Bertrand Roehner, Patterns of Speculation [24], p. xv. Of 
course, the same can be done on any sub-world, such as economics or even arts. The latter offers 
a rich array of irregularities and transient modes. 

Chemical knowledge at any particular moment is an Ar-system, for which history is 
irrelevant. It represents part of the real world of molecular structures known today. As a creation 
in human mind, however, chemistry has a history coded by references in chemical literature, as 
well as in CAS registry. No philosophic position is required to appreciate this fact because both 
chemistry and its history are observable objects. The collective human mind builds both 
systems, day by day. Chemistry here is just an example, and AI or football can be substituted for 
it. 

Traveling back to the small but dense world of ancient AI, we may discover there a 
richness of vision lost today among the vast expanses. 


ACKNOWLEDGEMENT: The author is deeply indebted to Ulf Grenander for generous 
support, encouragement, and criticism, as well as for great lessons of pattern thinking, MATLAB 
virtuosity, and intellectual independence. 


MINDSCALE MATLAB codes are available from: APPENDIX to Molecules and 
Thoughts (http://spirospero.net/mindscale-codes.doc, .pdf, or .zip) 





REFERENCES 


See also http://spirospero.net/complexity.htm 

1. Source: Ulf Grenander, http://www.dam.brown.edu/ptg/REPORTS/mind.pdf 

2. Ulf Grenander, General Pattern Theory: a Mathematical Study of Regular Structures. 
Oxford: Clarendon Press, 1993 

3. Ulf Grenander, Elements of Pattern Theory. Baltimore: Johns Hopkins University Press, 
1996. 

4. Ulf Grenander, Lectures in Pattern Theory. Vols 1-3. Berlin: Springer-Verlag, 1976-1981. 

5. Christoph Adami, Introduction to Artificial Life, Berlin: Springer-Verlag, 1998. 


Some Alife web sites: 

http://www.alcyone.com/max/links/alife.html 

http://www.lalena.com/ai/ 

http://www.alife.org 

6. Source on Roomba: http://www.hammacher.com/publish/66632.asp 

7. Sources on Grey Walter: www.csulb.edu/~wmartinz/rssc/newsletters/may99.pdf 

www.soc.uiuc.edu/faculty/pickerin/tortoises.pdf. 


8. Carl G. Looney, Pattern Recognition Using Neural Networks, New York: Oxford University 
Press, 1997, p.221. 


9. CAS substance counter: http://www.cas.org/cgi-bin/regreport.pl 

CAS site: http://www.cas.org/substance.html. 

10. Nicolas Bourbaki, Elements of Mathematics: Theory of Sets, Addison-Wesley, originally 
published by Hermann (Paris), 1968. 

11. John C. Kotz, and Paul Treichel, Jr., Chemistry and Chemical Reactivity, Saunders College 
Publishing, 1999, p. 396 

12. W. Ross Ashby, Design for a Brain: The Origin of Adaptive Behavior, 2nd Ed., New York: 
Wiley, 1960 (originally, 1952). 

13. W. Ross Ashby, An Introduction to Cybernetics, London: Chapman & Hall, 1956/1964. 

(A short gracious bio: http://www.isss.org/lumashby.htm) 

14. Leon Festinger, A Theory of Cognitive Dissonance, Stanford: Stanford University Press, 
1962. 

15. Source: http://www.soc.uiuc.edu/faculty/pickerin/tortoises.pdf 

16. M. Eigen, Selforganization of Matter and the Evolution of Biological Macromolecules, Die 
Naturwissenschaften, 58, 465-522 (1971). 

17. M. Eigen, The Hypercycle, ibid, 64, 541-565 (1977), 65, 7-41 (1978). 

Also published as separate book: Manfred Eigen and Peter Schuster, The Hypercycle - A 
Principle of Natural Self Organization, Berlin: Springer-Verlag, 1979. 

18. Source: http://www.kaytat.com/lyrics/lets_do_it.html 

19. H. Poincare, The Foundations of Science, Lancaster, PA: The Science Press, 1946, p. 393. 

20. Antonio Damasio, The Feeling of What Happens: Body and Emotions in the Making of 
Consciousness, NY, San Diego, London: Harcourt, Brace, & Company, 1999. 





21. H. G. Alexander, Ed., Leibniz-Clarke Correspondence, Manchester University Press, 1956, 
p.25. Last reprinted in 1998. 

22. Bertrand M. Roehner, Tony Syme, Pattern and Repertoire in History, Cambridge, MA, 
London, England: Harvard University Press, 2002. 

23. Randall Collins, The Sociology of Philosophies: A Global Theory of Intellectual Change. 
Cambridge, Mass.: Belknap Press, 1998. 

24. Bertrand Roehner, Patterns of Speculation, Cambridge: Cambridge University Press, 2002. 



APPENDIX to: 

Molecules and Thoughts 

Pattern Complexity and Evolution in Chemical Systems and the Mind 
http://spirospero.net/mindscale.pdf ; http://spirospero.net/MINDSCALE.pdf 

http://www.scribd.com/doc/11576667/-Yuri-Tarnopolsky-MOLECULES-AND-THOUGHTS 


Yuri Tarnopolsky 
PROTO 

WORKSPACE: PROTO.mat 
http://spirospero.net/PROTO.mat 

The MATLAB program PROTO requires workspace PROTO.mat, is stored as scripts 
proto.m (main), proto1.m, and proto2.m, and generates a figure of the tree with red 
asterisks meaning acts of selection (actons). 

To activate the program, save workspace (matrices P, T, and GXY) and the scripts. The 
main program offers a choice of starting from equal probability distribution (proto1) or 
continuing from the previous workspace, parameters, and probability distribution 
(proto2). 

proto.m 

%PROTO; script proto; can be renamed 

S=input(' 1 to start, 2 to continue '); 

if S==1 
    proto1 
end 

if S==2 
    proto2 
end 








proto1.m 

%PROTO1; TO START; file proto1.m; workspace proto.mat 
%SETS PARAMETERS AND STARTS FIRST BATCH 
% can be followed by proto2.m 
%Workspace: P(32x11), T(31x31), GXY(31,2). 

%Initial settings: 

H=0.7; F=0.9; C=1; gup=0.5; w=2; a=2; t=0.1; 

%%%%%%%%PARAMETERS SECTION %%%%%%%%%%%%%%%% 

n=input ('Enter n ');%n=20; 
a=input('Enter a');% a=2; 

%b=input('Enter b '); %Optional second acton 
F=input('Enter F' );%F=0.8; 

C=input('Enter C ');%C=1; 

H=input('Enter H ');%H=0.8; 

%gup=input('Enter gup '); 
gup=0.5; 

%w=input('Enter w'); 
w=2; 

gdown=w*gup; 

%gdown=input('Enter gdown'); 

%gdown=2; 

%t=input('Enter pause t' ); 

%t=0.1; 

tra=[]; % tra: trajectory 

ar=0; %ar: arity 

aa=a; %aa: cell for the previous acton; 

%%%%%%%%%%% END of PARAMETERS SECTION 

%%%%%%%%%%%%%%%%% 


PAR12=['a=',num2str(a)]; 

text(0.01,0.3,PAR12); %displays starting acton 


PX=P(1:31,2)/14; PY=P(1:31,3)/8.5; P(32,7)=0; 
%factors 14 and 8.5 to fit the figure 


%COLUMNS in matrix P: 1: Node #; 2,3: Node coordinates; 4-6: Node neighborhood; 
% 7: Probability, p; 8: Influence; 9: Active node, A; 10: Memory, M; 
%11: Spare column. Line 32: zeros. 


%INITIAL MATRIX P. Equal probabilities. 

P(1:31,7)=1/31; P(:,8:11)=0; %Columns 7 to 11 contain variables 

% Data for the acton: 

P(a,7)=C; P(a,9)=1; P(a,10)=C; %ACTON ENTERED 

AA=zeros(n,1); %array for the trajectory prepared; 

%contains subsequent actons 

%%%%%%%%%%%%%%%% DISPLAY SECTION %%%%%%%%%%%%%%% 

%CONNECTOR GRAPH DISPLAY 
gplot(T,GXY); 

%NODE NUMBERS DISPLAY; 

for i=1:31 
    No=int2str(i); 
    text(PX(i),PY(i),No); 
end 

%ACTON ASTERISK DISPLAY 

text(PX(a),PY(a)+0.01,'*','color',[0 1 0],'FontSize',24); 

%text(PX(b),PY(b)+0.01,'*','color',[0 1 1],'FontSize',24); 

% b is the optional second acton 

%PARAMETERS DISPLAY 

PAR1=['n= ',num2str(n)]; PAR11=['H= ',num2str(H)]; 

PAR2=['C= ',num2str(C),'; F= ',num2str(F)]; 

PAR3=['gup= ',num2str(gup),'; gdown= ',num2str(gdown)]; 
text(0.01,0.25,PAR1); 
text(0.01,0.2,PAR2); 

text(0.07,0.28,'*','color',[0 1 0],'FontSize',24); 
text(0.7,0.15,PAR3); text(0.01,0.15,PAR11); 

%%%%%%%%%%%%%%%%% END OF DISPLAY SECTION 

%%%%%%%%%%%%%% 

%PROTO CONTINUES, PR STARTS; PR adds n cycles. 


for jj=1:n 


% START MEMORY 
for ii=1:31 
    P(ii,10)=F*P(ii,10); %memory of every node fades by factor F 
end 

P(a,10)=F*C; P(a,7)=C; %special values for the acton 


% END MEMORY 
%START INFLUENCE 

P(a,7)=H*C; %to exert influence 

P(:,8)=0; 
for I=1:31 

for j=4:6 

if P(I,1)<P(I,j) 
g=gdown; else g=gup; 

end 

if I<16, ar=1; end 
if I>16 & I<31, ar=1/3; end 
if I==31, ar=1/2; end 

%P(I,8)=P(I,8)+P(P(I,j),7)*g*ar; 

%previous influence remembered 
P(I,8)=P(P(I,j),7)*g*ar; 
end 

end 

%END INFLUENCE 
%START PROBABILITY 

P(a,7)=C; %just a formality 
for I=1:31, P(I,7)=P(I,7)+P(I,8)+P(I,10); end 

%%%%%%%%%%%%%%% 

%END PROBABILITY 

% START ACTON SELECTION 

P(a,7)=0; %cannot be selected again 
SUM=0; 

ss=0; S=P( 1:31,7); SUM=sum(S); P(1:31,7)=P(1:31,7)/SUM; 
r=rand; 

for i=1:31, ss=ss+P(i,7); 
if ss>r, a=i; break, 
end 
end 

P(:,9)=0; P(a,9)=1; %information about acton; not used 
P(aa,7)=C; %probability for the previous acton; 

% END ACTON SELECTION 



% ACTON ASTERISK DISPLAY; Randomization of the position of ASTERISK. 
%This is done redundantly,with possible other uses in mind 
%COORDINATES 
LXx=P(1:31,2); LX=LXx'/14; 

LYy=P(1:32,3); LY=LYy'/8.5; 

for I=1:31 

%RANDOM SHIFT OF X 
u=rand/40; 

%RANDOMIZATION OF SIGN 
uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end 
u=u*uuu; LX(1,I)=LX(1,I)+u; end 
for I=1:31 

%RANDOM SHIFT OF Y 
u=rand/40; 

%RANDOMIZATION OF SIGN 
uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end 
u=u*uuu; LY(1,I)=LY(1,I)+u; end 

text(LX(a),LY(a),'*','color',[1 0 0],'FontSize',12); 

%END ASTERISK DISPLAY 

% END OF PROTO FOLLOWS 
pause (t); 

a; AA(jj,1)=a; tra=AA'; 
end 

PP=zeros(31,2); PP(:,1)=P(1:31,1); PP(:,2)=P(1:31,7); 
disp('type PP for probability distribution '); 
disp('type tra for trajectory'); 
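
The acton selection step in the listing above is a roulette-wheel draw: the probabilities are normalized, a random number r is drawn, and the first node whose running sum exceeds r becomes the new acton. The following is a minimal MATLAB/Octave sketch of that step only, not the author's code. The five weights and the variable names (p, edges, counts, freq) are hypothetical and are not taken from the PROTO workspace; cumsum and find are used here in place of the explicit loop.

```matlab
% Hypothetical weights for 5 nodes (illustration only, not the PROTO matrix P)
p = [0.1 0.4 0 0.3 0.2];
p = p / sum(p);              % normalize so that the weights sum to 1
edges = cumsum(p);           % right edges of the roulette-wheel sectors

r = 0.45;                    % fixed draw for reproducibility; r = rand in actual use
a = find(edges > r, 1);      % first sector whose right edge exceeds r

% Empirical check: over many draws the selection frequencies approach p
nDraws = 20000;
counts = zeros(size(p));
for k = 1:nDraws
    idx = find(edges > rand, 1);
    if isempty(idx), idx = numel(p); end   % guard against rounding in cumsum
    counts(idx) = counts(idx) + 1;
end
freq = counts / nDraws;
```

A node with zero weight (the third one here) can never be drawn, which is how setting P(a,7)=0 in the listing keeps the current acton from being selected twice in a row.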


proto2.m 


%PROTO2; TO CONTINUE; file proto2.m; workspace proto.mat 
%CONTINUES WITH PREVIOUS WORKSPACE AND 
%PROBABILITY DISTRIBUTION; Prompts for n 
%PROTO2 is done redundantly, with possible other uses in mind 

%Workspace: P(32x11), T(31x31), GXY(31,2). 

%%%%%%%%PARAMETERS SECTION %%%%%%%%%%%%%%%% 
n=input('Enter n '); %n=20; 


%a=input('Enter a '); %a=2; %F=input('Enter F '); %F=0.8; 
%C=input('Enter C '); %C=1; %H=input('Enter H '); %H=0.8; 



%gup=input('Enter gup '); %gup=0.5; 

%w=input('Enter w ');%w=2; %gdown=w*gup; 
%t=input('Enter pause t' );%t=0.1; 

tra=[]; % tra: trajectory 

%%%%%%%% END of PARAMETERS SECTION %%%%%%%%%%%%%% 


%COLUMNS in matrix P: 1: Node #; 2,3: Node coordinates; 4-6: Node neighborhood; 
% 7: Probability, p; 8: Influence, I; 9: Active node, A; 10: Memory, M; 
%11: Spare column. Line 32: zeros. 

%INITIAL MATRIX P. Equal probabilities. 

%P(1:31,7)=1/31; P(:,8:11)=0; %Columns 7 to 11 contain variables 

% Data for the acton: 

P(a,7)=C; P(a,9)=1; P(a,10)=C; %ACTON ENTERED 

AA=zeros(n,1); %array for the trajectory prepared; 

%contains subsequent actons 

%%%%%%%%%%%%%%%% DISPLAY SECTION %%%%%%%%%%%%%%% 

%CONNECTOR GRAPH DISPLAY 
%gplot(T,GXY); 

%NODE NUMBERS DISPLAY; 

%for i=1:31 
%No=int2str(i); 
%text(PX(i), PY(i), No); 
%end 

%NEW ACTON ASTERISK DISPLAY (THE ACTON LEFT BY PROTO 1) 
%text(PX(a),PY(a)+0.01,'*','color',[0 1 0],'FontSize',24); 

rectangle('Position',[0, 0.1, 0.12, 0.25],'FaceColor','w') 
rectangle('Position',[0.7, 0.1, 0.3, 0.1],'FaceColor','w') 

%PARAMETERS DISPLAY 
PAR1=['n= ',num2str(n)]; 

PAR11=['H= ',num2str(H)]; 

PAR2=['C= ',num2str(C)]; 

PAR22=['F= ',num2str(F)]; 

PAR3=['gup=',num2str(gup),'; gdown=',num2str(gdown)]; 

text(0.01,0.27,PAR1); 

text(0.01,0.22,PAR11); 

text(0.01,0.17,PAR2); 

text(0.01,0.12,PAR22); 



text(0.72,0.15,PAR3); 

%%%%%%%%%%%%%% END OF DISPLAY SECTION %%%%%%%%%%% 


%PROTO CONTINUES, adds n cycles. 


for jj=1:n 

% START MEMORY 
for ii=1:31 
P(ii,10)=F*P(ii,10); 
end 

P(a,10)=F*C; P(a,7)=C; %special values for the acton 
% END MEMORY 
%START INFLUENCE 


P(a,7)=H*C; %to exert influence 


P(:,8)=0; 
for I=1:31 

for j=4:6 

if P(I,1)<P(I,j) 
g=gdown; else g=gup; 

end 

if I<16, ar=1; end 
if I>16 & I<31, ar=1/3; end 
if I==31, ar=1/2; end 

%P(I,8)=P(I,8)+P(P(I,j),7)*g*ar; 

%previous influence remembered; optional 

P(I,8)=P(P(I,j),7)*g*ar; 

end 

end 

%END INFLUENCE 

%START PROBABILITY 

P(a,7)=C; %just a formality; will be zero anyway 

for I=1:31, P(I,7)=P(I,7)+P(I,8)+P(I,10); end 


%%%%%%%%%%%%%%% 

%END PROBABILITY 



% START ACTON SELECTION 

P(a,7)=0; %cannot be selected again 
SUM=0; 

ss=0; S=P( 1:31,7); SUM=sum(S); P(1:31,7)=P(1:31,7)/SUM; 
r=rand; 

for i=1:31, ss=ss+P(i,7); 
if ss>r, a=i; break, 
end 
end 

P(:,9)=0; P(a,9)=1; %information about acton; not used 
P(aa,7)=C; %probability for the previous acton; 

% END ACTON SELECTION 

% START ACTON ASTERISK DISPLAY; 

%Randomize the position of ASTERISK. 

%COORDINATES 
LXx=P(1:31,2); LX=LXx'/14; 

LYy=P(1:32,3); LY=LYy'/8.5; 

for I=1:31 

%RANDOM SHIFT OF X 
u=rand/40; 

%RANDOMIZATION OF SIGN 
uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end 
u=u*uuu; LX(1,I)=LX(1,I)+u; end 
for I=1:31 

%RANDOM SHIFT OF Y 
u=rand/40; 

%RANDOMIZATION OF SIGN 
uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end 
u=u*uuu; LY(1,I)=LY(1,I)+u; end 

text(LX(a),LY(a),'*','color',[1 0 0],'FontSize',12); 

%END ASTERISK DISPLAY 

% END OF PROTO FOLLOWS 
pause (t); 

a; AA(jj,1)=a; tra=AA'; 
end 

PP=zeros(31,2); PP(:,1)=P(1:31,1); PP(:,2)=P(1:31,7); 
disp ('type PP for probability distribution '); 
disp ('type tra for trajectory'); 



BIRDS 


WORKSPACE: BIRD.mat 

http://spirospero.net/BIRD.mat 

To use BIRDS, copy the workspace BIRDS.mat and the scripts bf.m, bf2.m, and bq.m. 
Start by entering bf. After that, use bf2. For bf2, n=1 to 10 is recommended. In this 
mode, drastic changes of direction can be observed even after one cycle. However, the 
system enters the collective mode only after a certain number of cycles (50-100). In terms 
of physics, it reaches a steady state far from equilibrium. At C=1, coherence is low, but at 
C=2 it is clearly seen. 


bf.m 


%SCRIPT bf 
%Initial parameters. 

nf=0; b=0; q=0; gg=0; gin=0; gout=0;gd=0;q=0; w=0; 
SUM=0; s=0; ss=0; S=0; nf=0; gr=0; 

%%%%%%%%. Input: 
n=5; d=16; 

F=0.5; C=1; H=1; gin=1; gr=0.8; gd=1; 
%%%%%%%% 
gout=gin*gr; 

FL(:,:,8)=1/32; 

FL(:,33,:)=0; 

FL(37,:,:)=0; 

FL(:,:,7)=0; 

FL(1:36,33,7)=d; 

FL(1:36,33,9)=C; 

FL(:,:,9:10)=0; 

L(1:36); 






%START 

for nf=1:n, for b=1:36 


%MEMORY 
% [Forget] 

FL(b, :,9)= F* FL(b,:,9); 

%[update last direction] 

FL(b, FL(b,33,7), 9)=C; 

FL(b, FL(b,33,7), 8)=H; 

%INFLUENCE, INDIVIDUAL 
FL(b,:, 10)=0; 
for ii=1:32 

gf=abs(ii-FL(b,33,7)); 
if gf>17, gf=32-gf; end 
%[gf=distance between directions ] 
if gf==1 

FL(b,33,10)=FL(b,33,10)+FL(b,ii,8)*gd; 
end 

end; 

% INFLUENCE, FLOCK 

INF=0; 

for w=1:32 

for q=2:9 

if FLA(b,1)<FLA(b,q) 
gg=gin; else gg=gout; end 

%[read neighbor's b] 
x=FLA(b,q); 

%x can be 37 

%[read its d(past){7}] 
xx=FL(x,33,7); 

%[read its P] 

PPP=FL(b,w,8); 

%[correct this Infl {10}] 

PPP=PPP*gg; 

INF=INF+PPP; 
end 

FL(b,w,10)=FL(b,w,10)+PPP; 
end 



%PROBABILITY 
for ii=1:32 

FL(b,ii,8)=FL(b,ii,8)+FL(b,ii,9)+FL(b,ii,10); 
end 

FL(b,FL(b,33,7),8)=C; 

%SELECTION 

SUM=0; ss=0; S=FL(b,1:32,8); SUM=sum(S); 

FL(b,:,8)=FL(b,:,8)/SUM; 
r=rand; 

for i=1:32, ss=ss+FL(b,i,8); 
if ss>r, d=i; break, 
end 
end 

%write new d 

FL(b,33,6)=d; 

end 

FL(:,33,7)=FL(:,33,6);FL(b,33,6)=0; 

A=FL(:,33,7);gplot(TOP,COORD); 

AH=['H = ',num2str(H)]; AC=['C = ',num2str(C)]; 
AF=['F = ',num2str(F)]; An=['n = ',num2str(n)]; 
Ag=['gd= ',num2str(gd)]; Agin=['gin = ',num2str(gin)]; 
Agout=['gout= ',num2str(gout)]; 
text(1.5,5.5,AC); text(1.5,4.5,AH); 
text(1.5,3.5,AF); text(1.5,2.5,Ag); text(1.5,1.5,Agin); 
text(3.4,1.5,Agout); text(3.5,2.5,An); 
for u=1:36 

X=COORD(u,1); Y=COORD(u,2); 

AA=A(u); AAA=[' ',num2str(AA)]; 
text(X,Y,AAA,'color',[1 0 0],'FontSize',14); 

end 

end 

%quiver plot: bq.m 

%for ll=1:36, L(ll)=2*pi*A(ll)/32; end 

%sL=sin(L); cL=cos(L); sL(37)=[]; cL(37)=[]; 

%quiver(COORD(:,1),COORD(:,2),sL',cL') 


%pause(2); 

bq; %if blocked, the grid output will be displayed 

bq.m 



for ll=1:36, L(ll)=2*pi*A(ll)/32; end 
sL=sin(L); cL=cos(L); sL(37)=[]; cL(37)=[]; 
quiver(COORD(:,1),COORD(:,2),sL',cL') 
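
The BIRDS description speaks of coherence being low at C=1 and clearly seen at C=2, but the scripts only display it visually through the quiver plot. One possible way to put a number on it is sketched below in MATLAB/Octave, under stated assumptions: the measure (length of the mean unit vector of all directions) is a common choice for circular data and is not part of the original scripts, and the two flocks, A_aligned and A_spread, are hypothetical. The conversion of a direction index to an angle, 2*pi*A/32, is the one used in bq.m.

```matlab
nDir = 32;                            % number of discrete directions, as in bf.m

% Two hypothetical flocks of 36 birds (not taken from the BIRD workspace)
A_aligned = 16 * ones(36, 1);         % every bird flies in direction 16
A_spread  = mod((0:35)', nDir) + 1;   % directions spread around the whole circle

% Coherence: length of the mean unit vector; 1 = fully aligned, near 0 = disordered
coherence = @(A) abs(mean(exp(1i * 2*pi*A/nDir)));

R_aligned = coherence(A_aligned);
R_spread  = coherence(A_spread);

% Distance between two direction indices measured around the circle
circDist = @(i, j) min(abs(i - j), nDir - abs(i - j));
dNeighbors = circDist(1, 32);         % directions 1 and 32 are adjacent
```

With the direction vector A left in the workspace by bf or bf2, coherence(A(1:36)) could be tracked from cycle to cycle to see when the collective mode sets in.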


bf2.m 

%%%%%%%%. Input: 

%n=3; d=16; F=0.3; C=2; H=1; gin=1; gr=0.8; gd=1; 
%%%%%%%% 

%START 

gout=gin*gr; 

FL(:,:,8)=1/32; 

FL(:,33,:)=0; 

FL(37,:,:)=0; 

FL(37,:,:)=0; 

FL(1:36,33,7)=d; 

FL(1:36,33,9)=C; 

FL(:,:,9:10)=0; 

FL(1:36,33,7)=d; 

L(1:36); 

for nf=1:n, for b=1:36 

%MEMORY 
% [Forget] 

FL(b, :,9)= F* FL(b,:,9); 

%[update last direction] 

FL(b, FL(b,33,7), 9)=C; 

FL(b, FL(b,33,7), 8)=H; 

%INFLUENCE, INDIVIDUAL 
FL(b,:, 10)=0; 
for ii=1:32 

gf=abs(ii-FL(b,33,7)); 
if gf>17, gf=32-gf; end 
%[gf=distance ] 
if gf==1 

FL(b,33,10)=FL(b,33,10)+FL(b,ii,8)*gd; 
end 


end; 




% INFLUENCE, FLOCK 

INF=0; 

for w=1:32 

for q=2:9 

if FLA(b,1)<FLA(b,q) 
gg=gin; else gg=gout; end 

%[read neighbor's b] 
x=FLA(b,q); 

%x can be 37 

%[read its d(past){7}] 
xx=FL(x,33,7); 

%[read its P] 

PPP=FL(b,w,8); 

%[correct this Infl {10}] 

PPP=PPP*gg; 

INF=INF+PPP; 
end 

FL(b,w,10)=FL(b,w,10)+PPP; 
end 

%PROBABILITY 
for ii=1:32 

FL(b,ii,8)=FL(b,ii,8)+FL(b,ii,9)+FL(b,ii,10); 
end 

FL(b,FL(b,33,7),8)=C; 

%SELECTION 

SUM=0; ss=0; S=FL(b,1:32,8); SUM=sum(S); 

FL(b,:,8)=FL(b,:,8)/SUM; 

r=rand; 

for i=1:32, ss=ss+FL(b,i,8); 
if ss>r, d=i; break, 
end 
end 

%write new d 

FL(b,33,6)=d; 

end 

FL(:,33,7)=FL(:,33,6);FL(b,33,6)=0; 

A=FL(:,33,7);gplot(TOP,COORD); 



for u=1:36 

X=COORD(u,1); Y=COORD(u,2); 

AA=A(u); AAA=[' ',num2str(AA)]; 
text(X,Y,AAA,'color',[1 0 0],'FontSize',14); 
end 

%DISPLAY PARAMETERS 

AH=['H = ',num2str(H)]; AC=['C = ',num2str(C)]; 

AF=['F = ',num2str(F)]; An=['n = ',num2str(n)]; 

Ag=['gd= ',num2str(gd)]; Agin=['gin = ',num2str(gin)]; 

Agout=['gout= ',num2str(gout)]; 

text (1.5,5.5,AC); text (1.5,4.5,AH); 

text (1.5,3.5,AF);text (1.5,2.5,Ag);text (1.5,1.5,Agin); 

text (3.4,1.5,Agout); text(3.5,2.5,An); 

end 

%QUIVER DISPLAY : bq.m 

%for ll=1:36, L(ll)=2*pi*A(ll)/32; end 

%sL=sin(L); cL=cos(L); sL(37)=[]; cL(37)=[]; 

%quiver(COORD(:,1),COORD(:,2),sL',cL') 

%pause(2); 

bq; %if blocked, the grid output will be displayed 


SCALE 


sc.m (main) 

% PROGRAM SCALE, script sc; RELATED: link; space lin.mat 
% 1. INITIALIZE 

S=input(' 1 to start, 2 to continue '); 
nn=input('enter number of cycles (nn): '); 

disp(' ATTENTION: Interspaced components should be entered') 
disp(' between apostrophes as single string of characters'); 
disp(' '); 

disp(' press any key to continue'); pause; 

% 2. START 

for n=1:nn, if (S==1)&(n==1), sw=1; W=zeros(sw,16,8); 

NAMES=['!']; WW=zeros(sw,sw); NM=[]; 





end 


% 3. INPUT 
I=input('enter components '); 

% 4. CONVERT INPUT into PIN; 

PIN=zeros(8,8); word=[]; jj=1; I=[I,' ']; ab=abs(I); lab=length(ab); 
for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); PIN(jj,1:lw)=word; word=[]; jj=jj+1; end, end 
% 5. CHECKING novelty of PIN against WORLD 
old=0; wiold=0; comp=[]; % PIN presumed new 

for wi=1:sw %wi: length of World 
% 5.1 CUTTING FLAT SLICE OF WORLD W(wi,:,:) 
fW=zeros(16,8); fW(:,:)=W(wi,:,:); 

% 5.2 COMPARE flat W and PIN as ordered sets % 

eqid=isequal(fW(9:16,:),PIN); if eqid==1, old=1; %PIN is OLD 

disp('This is old '),disp(NAMES((wi),:)), wiold=wi; list=find(WW(:,wiold)); 
leli=length(list); SN=[]; for k=1:leli, SN=strvcat(SN,NAMES((leli),:)); end 
end, 


% 5.3 COMPARE flat W and PIN as sets 

set=0; 

if isempty(setxor(fW((1:8),:),PIN,'rows')), if old==0, 
if eqid~=1, 

note=['Looks like W',int2str(wi),': ',NAMES(wi,:)]; 
disp(note); set=1; end, end, old=1; end 
%%%%%% 
if set==1, 

nnew=input(' Is it new? 1/0 '); 
if nnew==1, old=0; end, end 
%%%%% 

% 5.4. CHECK for SET MEMBERS to create SPECTRUM; 

eqcomp=intersect(PIN,fW,'rows'); 
h=size(eqcomp,1); test=zeros(1,8); 
for hh=1:h, 

if ((~isempty(eqcomp))&(~isequal(eqcomp(hh,:),test))), 
comp=[comp,wi]; end, end 
end %for wi=1:sw 

spectrum=unique(comp); les=length(spectrum); 

%disp('spectrum: '); %for r=1:les, disp(NAMES(spectrum(r),:)); end 
%if old==1, SNU=unique(SN,'rows'); disp('OLD spectrum '), disp(SNU), end, 
% 6. IF NEW, EXPAND THE WORLD: 

% 6.1. NAME THE NEW ENTRY 



if old==0, NAME=input('This is new. Name it '); 

NAMES=strvcat(NAMES,NAME); 

% 6.2. CREATE NewPIN containing NAME 

NAME=[NAME,' ']; NewPIN=zeros(8,8); word=[]; jj=1; ab=abs(NAME); 
lab=length(ab); for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); NewPIN(jj,1:lw)=word; word=[]; jj=jj+1; end, 
end 

% 6.3. ADD PIN and NewPIN to WORLD 

sw=sw+1; WW(sw,:)=0; WW(:,sw)=0; % PLACE IN THE WORLD 

W(sw,9:16,:)=NewPIN(:,:); W(sw,1:8,:)=PIN(:,:); 

% All 16 lines in WORLD are filled up 

% 6.4. RECORD NEW LINKS 
for ll=1:les, if ~isempty(spectrum(ll)), 

WW(sw,spectrum(ll))=1; WW(spectrum(ll),sw)=1; end, end, 

end % if old==0, 

%To read an entry from W(x,y,:) %D=nonzeros(W(x,y,:)); D=D'; setstr(D); 

end % for n=1:nn 
NM=NAMES; world=WW; 

disp('If you want to check WORLD for a name, type: link ') 
disp('To see NAMES, type: NM '); 
disp('To display the 3D world, type plotW '); 
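
Steps 4, 5.2, and 5.3 of the listing above rest on one data structure: an 8x8 matrix of character codes with one word per row, compared first with isequal (order matters) and then with setxor(...,'rows') (order does not matter). The following MATLAB/Octave sketch isolates that idea. It is an illustration under assumptions, not the author's code: the two input strings and the names inputs, PINs, P1, P2 are hypothetical, and regexp is used in place of the character-by-character loop of step 4.

```matlab
% Two hypothetical component strings with the same words in a different order
inputs = {'cat dog', 'dog cat'};
PINs = zeros(8, 8, numel(inputs));   % one 8x8 matrix of character codes per input

for k = 1:numel(inputs)
    words = regexp(inputs{k}, '\S+', 'match');   % split on white space
    for j = 1:numel(words)
        w = double(words{j});        % character codes, as abs(I) gives in sc.m
        PINs(j, 1:numel(w), k) = w;  % one word per row, padded with zeros
    end
end

P1 = PINs(:, :, 1);
P2 = PINs(:, :, 2);

sameOrdered = isequal(P1, P2);                   % compared as ordered sets
sameAsSet   = isempty(setxor(P1, P2, 'rows'));   % compared as unordered sets
firstWord   = char(P1(1, 1:3));                  % decode a row back into text
```

Here sameOrdered is false and sameAsSet is true, which is the situation in which sc.m prints "Looks like ..." and asks whether the entry is new.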


sct.m 

% PROGRAM SCALE, script sc; RELATED: link; space lin.mat 
% 1. INITIALIZE 

S=input(' 1 to start, 2 to continue '); nn=input('nn '); 

% 2. START 

for n=1:nn, if (S==1)&(n==1), sw=1; W=zeros(sw,16,8); 

NAMES=['!']; WW=zeros(sw,sw); 
end 

% 3. INPUT 

I=input('type interspaced components as single character string '); 


% 4. CONVERT INPUT into PIN; 

PIN=zeros(8,8); word=[]; jj=1; I=[I,' ']; ab=abs(I); lab=length(ab); 

for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); PIN(jj,1:lw)=word; word=[]; jj=jj+1; end, end 
% 5. CHECKING novelty of PIN against WORLD 
old=0; wiold=0; comp=[]; % PIN presumed new 

for wi=1:sw %wi: length of World 
% 5.1 CUTTING FLAT SLICE OF WORLD W(wi,:,:) 
fW=zeros(16,8); fW(:,:)=W(wi,:,:); 

% 5.2 COMPARE flat W and PIN as ordered sets % 

eqid=isequal(fW(9:16,:),PIN); if eqid==1, old=1; %PIN is OLD 
disp('This is old '),disp(NAMES((wi),:)), wiold=wi; list=find(WW(:,wiold)); 
leli=length(list); SN=[]; for k=1:leli, SN=strvcat(SN,NAMES((leli),:)); end 
end, 


% 5.3 COMPARE flat W and PIN as sets 

set=0; 

if isempty(setxor(fW((1:8),:),PIN,'rows')), if old==0, 
if eqid~=1, 

note=['Looks like W',int2str(wi),': ',NAMES(wi,:)]; 
disp(note); set=1; end, end, old=1; end 
%%%%%% 
if set==1, 

nnew=input(' Is it new? 1/0 '); 
if nnew==1, old=0; end, end 
%%%%% 

% 5.4. CHECK for SET MEMBERS to create SPECTRUM; 

eqcomp=intersect(PIN,fW,'rows'); 
h=size(eqcomp,1); test=zeros(1,8); 
for hh=1:h, 

if ((~isempty(eqcomp))&(~isequal(eqcomp(hh,:),test))), 
comp=[comp,wi]; end, end 
end %for wi=1:sw 

spectrum=unique(comp); les=length(spectrum); 

%disp('spectrum: '); %for r=1:les, disp(NAMES(spectrum(r),:)); end 
%if old==1, SNU=unique(SN,'rows'); disp('OLD spectrum '), disp(SNU), end, 
% 6. IF NEW, EXPAND THE WORLD: 

% 6.1. NAME THE NEW ENTRY 

if old==0, NAME=input('This is new. Name it '); 

NAMES=strvcat(NAMES,NAME); 

% 6.2. CREATE NewPIN containing NAME 

NAME=[NAME,' ']; NewPIN=zeros(8,8); word=[]; jj=1; ab=abs(NAME); 
lab=length(ab); for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); NewPIN(jj,1:lw)=word; word=[]; jj=jj+1; end, 
end 

% 6.3. ADD PIN and NewPIN to WORLD 

sw=sw+1; WW(sw,:)=0; WW(:,sw)=0; % PLACE IN THE WORLD 

W(sw,9:16,:)=NewPIN(:,:); W(sw,1:8,:)=PIN(:,:); 

% All 16 lines in WORLD are filled up 

% 6.4. RECORD NEW LINKS 
for ll=1:les, if ~isempty(spectrum(ll)), 

WW(sw,spectrum(ll))=1; WW(spectrum(ll),sw)=1; end, end, 

end % if old==0, 

%To read an entry from W(x,y,:) %D=nonzeros(W(x,y,:)); D=D'; setstr(D); 

end % for n=1:nn 
NM=NAMES; world=WW; 

disp('If you want to check WORLD for a name, type: link ') 
disp('To see NAMES, type: NM '); 
disp('To display the 3D world, type plotW '); 


flatW.m 

%script flatW 
cla 

om=length(WW); 

PX=zeros(om,1); PY=zeros(om,1); XYP=zeros(om,2); 
angles=(2*pi./100).*[0:100]; 
plot(cos(angles),sin(angles)) 
hold on 

angles=(2*pi/om).*[0:om-1]; 
for j=1:om 

%plot(cos(angles(j)).*[1 1.5],sin(angles(j)).*[1 1.5]) 
text(cos(angles(j)),sin(angles(j)),[num2str(j),' ',NAMES(j,:)]), 
PX(j)=cos(angles(j)); PY(j)=sin(angles(j)); end 
XYP(:,1)=PX; XYP(:,2)=PY; gplot(WW,XYP); 

%end of script flatW 


link.m 



% CHECKING THE WORLD FOR NAME; SCRIPT: link 
I=input('NAME to check'); 

CI=cellstr(I); 

SC=nonzeros(strmatch(CI,NAMES)); %find NAME's numbers 
LINKS=find(WW(:,SC)); links=LINKS'; 

LL=[]; %list of links 
NAMELIST=[]; 
for i=1:length(SC) 

LL=[LL,(find(WW(:,SC(i))))']; 
end 

lel=length(LL); for r=1:lel, NAM=(NAMES(LL(r),:)); 
NAMELIST=strvcat(NAMELIST,NAM); end, 
NAMELIST=unique(NAMELIST,'rows'); disp(NAMELIST) 

plotW.m 

%script: plotW, plot world 
close; 

om=length(WW); 

PX=zeros(om,1); PY=zeros(om,1); XYP=zeros(om,3); PZ=[1:om]; PZ=PZ'/om; 
angles=(2*pi/om).*[0:om-1]; 
for j=1:om 

PX(j)=cos(angles(j)); PY(j)=sin(angles(j)); 

end 

XYP(:,1)=PX; XYP(:,2)=PY; XYP(:,3)=PZ; 

[iW,jW]=find(WW); 

liW=length(iW); 

MW=zeros(liW,2); 

MW(:,1)=iW; MW(:,2)=jW; 
for i=1:liW 

line([XYP(MW(i,1),1),XYP(MW(i,2),1)],... 
[XYP(MW(i,1),2),XYP(MW(i,2),2)],... 
[XYP(MW(i,1),3),XYP(MW(i,2),3)]); 
end 

hold on, 
stack 


stack.m 


%script: stack 

ZZ=zeros(1,41); 

tt=0; 

for zz=0:0.1:1 

t=0:pi/20:(2*pi); ZZ(1,:)=zz; tt=ZZ; 
plot3(sin(t),cos(t),tt),hold on, 
grid on 
axis square 
tt=tt+l; 
end 





First draft. August 2003. 


TRANSITION STATE IN PATTERNS OF HISTORY 


Yuri Tarnopolsky 


ABSTRACT 


A review of literature shows that a diaspora of natural scientists interested in history has 
been forming for some time around the legacy of Rashevsky, Richardson, and Tilly. 
Analytical history of Bertrand Roehmer and Tony Syme, as well as the study of patterns 
of military conflicts by Peter Brecke, have a potential of becoming centers of the 
“naturalization” of historical research. 

The existing formal approaches to complex systems create a conundrum of the 
use of closed mathematical structures for representing open irreversible systems. Pattern 
Theory (Ulf Grenander) is suggested as another entry in the inventory of methods in this 
area. 

Pattern Theory, with its atomistic realism and preservation of semantics, is 
uniquely positioned for developing a general representation of Very Complex Open 
Systems, such as life, mind, and society. The focus of the application of Pattern Theory to 
history is the transition state, characterized by its irregularity and, therefore, instability. 

The possibility of treating a segment of history as a quasi-chemical structural 
transformation through alternating stable and irregular states is illustrated by the example 
of the expedition of Darius against the Scyths. 





CONTENTS 


1. Preamble 

2. The intent of the paper 

3. Historical sketch 

4. History and thermodynamics 

5. Pattern Theory 

6. Chemical kinetics: the short run 

7. Equilibrium, kinetics, and catalysis in history 

8. Illustration: Darius against Scyths 

9. Conclusion 

10. References 


1. Preamble 


The divide between the two cultures—sciences and humanities—appears to be slowly closing 
due to two trends. One is the influence of the pervasive science and technology on the creative 
process in humanities. An installation of the artist Janine Antoni who weaves a recorded pattern 
of her own Rapid Eye Movements on a traditional loom is an example. The other trend is the 
unstoppable spread of sciences, enhanced by mathematics and computers, over the traditional 
areas of humanities. Social sciences seem to be an emerging Isthmus of Panama between the two 
previously separated continents. Compare three quotations, one of a sociologist, the second of a 
physicist, and the third of a fiction writer. 


I should like to deal with the promise of social science in terms of three prospects which seem to 
me both possible and desirable for the twenty-first century: the epistemological reunification of 
the so-called two cultures, that of science and the humanities; the organizational reunification and 
redivision of the social sciences; and the assumption by social science of centrality in the world of 
knowledge (Wallerstein, 1998). 

Today we are becoming more and more conscious of the fact that on all levels, from elementary 
particles to cosmology, randomness and irreversibility play an ever-increasing role. Science is 
rediscovering time. This obviously introduces a new dimension into the old problem of the two 
cultures, science and the humanities (Prigogine, 1983). 


But Denise left the kitchen and took the plate to Alfred, for whom the problem of existence was 
this: that, in manner of a wheat seedling thrusting itself up out of the earth, the world moved 
forward in time by adding cell after cell to its leading edge, piling moment on moment, and to 
grasp the world even in its freshest, youngest moment provided no guarantee that you’d be able to 
grasp it again a moment later (Franzen, 2001, p. 66). 

In the same book, Jonathan Franzen uses the adjective “Cherenkov blue,” comparing an eye 
color to a physical effect that can be observed only in some nuclear reactors. 

The above excerpts shall usher us into an area that has attracted the attention of 
philosophers since time immemorial and of physicists, mathematicians, and computer scientists for 
decades: history. What can the hard sciences say about human history? This time it is the take of a 
chemist. 

In his The Rise and Fall of the Great Powers Paul Kennedy (1987) discussed the reasons 
why no great power in the last 500 years of history was able to maintain its status. Neither was 
any great empire of the earlier times, although at the peak of power the decline seemed for most 
observers unthinkable. In his subsequent book Kennedy (1993) reiterated and emphasized 
numerous tensions that could threaten the current American status. 

Kennedy’s earlier book was written before the collapse of the USSR and the later one 
before the war on terrorism and the rise of anti-Americanism. Nevertheless the aforementioned 
conclusion about the transient character of economic power and military domination seems to be 
a regularity of history. If history comes as a surprise, then one day we might be surprised by a 
great power, remaining indefinitely unchallenged, as if history ended its regular course, in 
accordance with Francis Fukuyama’s (1992) conclusions. If history had the invariance analyzed 
by Kennedy and obvious from any textbook of history, the pattern of the rise and decline would 
persist. The problem with this reasoning is that we have no means to supply any facts whether in 
support or in refutation because they reside in the future. In the absence of facts, a theory is a 
substitute, but there is no such universally accepted theory. 





In spite of an accumulated library of patterns, warning signs, and wise predictions, 
history always surprises the contemporaries, requiring a post factum explanation. The main 
reason for this Cassandra Effect is the impossibility to tell when the next turn comes. 

Paul Kennedy, emphasizing the difficulty of predicting the future, presented a series of 
arguments pro and contra the thesis that the USA was entering a period of decline as a great power. 
The question is: what is the relation between the regularity and surprise in history? What is 
regularity (law, invariance), surprise (irregularity, novelty, invention), and what is a rationale for 
a particular timing of a historical event? 

It may be too early to expect satisfying answers to these questions, but it might be the 
right time to ask in what direction we should look for them. 


2. The intent of the paper 


It would be possible to discuss the above example in search of a consensus if there were some 
theoretical foundations of history. At a certain level of abstraction, the theory would be invariant 
under a substitution of the Roman Empire for the USA or of the second Gulf War for the war of 
the Persian king Darius the Great with the Scyths. No historian, however, would sacrifice the 
obvious differences. 

If we remove the terms (semantics) from consideration, what remains is relations 
(syntax), which is all that most mathematics is about. A search for such foundations by means of 
natural sciences, which unlike humanities are based on consensus rather than on institutionalized 
dissent (compare with dissent-based philosophy, Collins, 1998), has its own history in the 
context of the study of complex systems. They will be further abbreviated as LMS—life, mind, 
and society—in order to avoid the difficult problem of defining complexity where no consensus 
exists. 

The purpose of this paper is to informally present Pattern Theory (Grenander, 1978- 
2003) as a possible component in scientific study of LMS and history in particular. I am not a 
mathematician to whose natural habitat Pattern Theory would belong but a chemist who 
intuitively perceives it as a kind of meta-chemistry, i.e., theory of atomistic structures. 





My first suspicion that chemistry is a kind of a metaphor for history goes back to the 
1970s, when I, together with many intellectuals in the former USSR, saw the Communist system 
as self-destructible and doomed to collapse. The question that nobody could answer—as nobody 
can answer the questions about current great and small powers—was: when? Today practically 
all professionals in the field of history and political science express their surprise that it came so 
soon. For about a decade after 1917, when the Bolsheviks came to power, the observers were 
surprised that it had lasted so long. 

Chemistry studies events in systems with extremely high and incompressible complexity. 
It possesses some uncommon tools to answer questions pertaining both to the nature of possible 
events and their timing. A chemist sees time through bifocal glasses: one lens for long distance 
in time and another for a close-up. Remarkably, this resembles the vision of a historian of the French 
Annales school, which distinguishes between la longue durée (Braudel, 1992) and the short run 
of smaller scale events. This type of historian regards the totality of all factors, such as 
economics, geography, beliefs, etc., and pays attention to numbers. 

The outcome of a chemical transformation in the long run is not the same as that observed 
with a stopwatch. The former inexorably goes toward equilibrium, while the latter is the result of 
the fastest among concurrent transformations. Catalysis can selectively speed up some 
transformations. By using the entirely catalytic and therefore fast biochemistry, the living 
organism escapes equilibrium but pays for that by the absence of anything comparable with the 
historical longue durée: the “great power” of an individual organism always collapses. 
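
The difference between the stopwatch and the long run can be made concrete with a toy calculation. The MATLAB/Octave sketch below assumes a hypothetical scheme in which a starting material A converts reversibly into B (fast, moderately stable) and into C (slow, more stable). The rate constants, the time step, and all variable names are illustrative assumptions, not data from any real reaction; simple Euler integration is used only because it keeps the sketch short.

```matlab
% Hypothetical rate constants in arbitrary units (illustration only)
k1 = 1.0;  km1 = 0.1;      % A <-> B: fast; equilibrium ratio B/A = 10
k2 = 0.1;  km2 = 0.001;    % A <-> C: slow; equilibrium ratio C/A = 100

dt = 0.01;                 % Euler time step
nSteps = 200000;           % integrate up to t = 2000
x = [1; 0; 0];             % initial amounts of A, B, C
xEarly = x;

for s = 1:nSteps
    A = x(1); B = x(2); C = x(3);
    dA = -(k1 + k2)*A + km1*B + km2*C;
    dB =  k1*A - km1*B;
    dC =  k2*A - km2*C;
    x = x + dt * [dA; dB; dC];
    if s == 500            % snapshot at t = 5: the stopwatch view
        xEarly = x;
    end
end

xLate = x;                 % the long-run view, close to equilibrium
xEq = [1; 10; 100] / 111;  % exact equilibrium amounts for these constants
```

In xEarly the fast product B dominates, while in xLate the more stable product C does. Raising k2 and km2 by the same factor, which is what a catalyst for the A-to-C path would do, changes the early snapshot but leaves xEq untouched.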

Bertalanffy (1968) defined system as a set of units with relationships between them, 
which is the same as to say that a “systemic” system can be mapped on a mathematical system. 
An important novelty was Bertalanffy’s emphasis on open systems and steady states instead of 
equilibrium. 

Mathematical systems and structures, however, operate under the axiom of closure, 
which is sometimes regarded as part of the definition of operation. Although renormalization can 
change the base set in an orderly manner, the new set is closed, too. I am not aware of any 
inquiry into the possibility of describing an open system in terms of a strictly formalized and 
closed one. The concept of an open system represented by a closed one, however, is just one 
possible liability of the traditional systemic approach. The other one, not too often discussed, 
encumbers the very term theory. What is theory? There are theories of philosophical grandeur, 
aspiring to the ultimate generality, and there are working theories seeking empirical practicality and 
subject to experiment. 

The additional purpose of this paper follows from my belief that a scientific theory of the 
second type, as it is most widely understood, has to satisfy a criterion of realism, i.e., being 
recognizable by an empiricist as relevant for the appropriate domain of a particular hands-on 
science. Thus, any meta-chemical construct should conform to a chemist’s view of chemical 
phenomena and any meta-theory of history should not be so elevated above the daily craft of 
historians that the names of persons and circumstances of events would be obliterated by 
symbolic formalism. In other words, the theory has to fill up the space between the roof gardens 
of abstract theory and the basement storage of the facts. The contraposition of theory and 
practice is sometimes expressed in terms of a fissure between syntax and semantics. 


This paper borrows important ideas from the manuscript History as Points and Lines by 
the author and Ulf Grenander. The entire project of pattern history was initiated by Ulf 
Grenander in 1994 and most important ideas belong to him. My own contribution is limited to 
the chemical angle of view, use of the concept of transition state, and the idea of an open 
(Heraclitean) formal system, in contrast to the closed (Aristotelian) system (Tarnopolsky, 2003). 

The following short and selective historical overview is intended to help illuminate the 
search for a mathematics of history as well as the epistemic place of Pattern Theory among 
other approaches. 


3. Historical sketch 


Leo Tolstoy filled up the volumes of his War and Peace with hundreds of names, situations, and 
events, recreating the atmosphere of a society in the process of a historical transformation from 
peace to war and back. He did it in terms of observables: name, facial expression, movement, 
utterance, location, thought, etc. He concluded his epic with a chapter of a contrasting nature 
where he outlined his theory of history, using images and metaphors of calculus and chemistry to 
present history and war as what we would now call a dynamical system of statistical physics. It 
could well be described by differential equations as well as intensive (will to fight) and extensive 
(army size) parameters. On Tolstoy’s mathematics, see Vitanyi. 

After Leo Tolstoy, one of the most recent—and gripping—attempts of humanities to 
paint a historical picture with colors borrowed from sciences belongs to philosopher Manuel De 
Landa (1997), who, for example, unites the origin of minerals, emergence of skeletons in 
biological evolution, and the rise of the cities under the same metaphor of mineralization. The 
close relation between metaphor and what is intuitively understood by the term pattern is 
palpable. This line leads back to Michel Foucault (1970) whose thinking was clearly 
mathematical but the language metaphoric and artistic. On metaphor in historiography—and the 
complex picture of the modern philosophy of history—see Ankersmit (1994). 

In 1968, one hundred years after War and Peace, Nicolas Rashevsky published his 
Looking at History through Mathematics (Rashevsky, 1968). He defined the purpose of his book 
as a collection of illustrations “how mathematical reasoning could in principle be used in 
attempted explanations of some historical phenomena.” Among them, imitative behavior and the 
development of aggressiveness, beliefs and prejudices, effect of the shoreline on cultural 
development, role of individual (without mentioning Leo Tolstoy), and some other aspects of 
“long history.” In his earlier “Mathematical Biology of Social Behavior” (Rashevsky, 1959), 
mathematics of history was a topic of an appendix, with such subjects as mathematics of 
political freedom. 

In modern terms, what Rashevsky applied to history could be called population dynamics, 
started by Lotka (1956|1925) and Volterra (1931), who were inspired, by the way, by chemical 
kinetics. Rashevsky focused on the populations of memes, although Richard Dawkins (1989), 
who invented the term, did not mention Rashevsky in his book. For example, “militarism” and 
“pacifism,” “laziness” and “activity,” are competing pairs of behavior reinforced by mutual 
imitation, i.e., replication, and inhibition for the sake of consistency of mentality. The main 
mathematical apparatus applied—and employed also in chemistry and artificial life—is 
differential equations. 
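
As an illustration only, the competition of two mutually inhibiting behaviors can be sketched with a Lotka-Volterra competition model integrated by the Euler method. This is not Rashevsky's own system of equations; the function name, the parameters r and a, and the starting fractions are assumptions chosen for the example.

```python
def simulate_competition(x0, y0, r=1.0, a=1.5, dt=0.01, steps=5000):
    """Euler integration of a symmetric two-behavior competition model.

    x and y are the fractions of a population showing each behavior
    (say, "militarism" and "pacifism"). Each behavior spreads by imitation
    (the r*x and r*y terms) and is inhibited by the other (strength a):

        dx/dt = r*x*(1 - x - a*y)
        dy/dt = r*y*(1 - y - a*x)

    All parameter values here are hypothetical.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = r * x * (1.0 - x - a * y)
        dy = r * y * (1.0 - y - a * x)
        x += dt * dx
        y += dt * dy
    return x, y


# With strong mutual inhibition (a > 1) the behavior that starts ahead wins.
final_x, final_y = simulate_competition(0.6, 0.4)
print(round(final_x, 3), round(final_y, 3))
```

With a > 1 the model is bistable: a small initial advantage is amplified until one behavior displaces the other, which is the kind of outcome the imitation-inhibition picture suggests.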

Rashevsky gradually became dissatisfied with the fragmentary descriptions of various 
aspects of LMS, including his own contributions. He came to the conclusion that physics in 
metric space may give a lot of details about life but not the entire view of the living system. He
formulated a cardinally different idea of “relational” biology and sociology as an opposite of
the “metric” and fragmentary physical view of the complex systems. For Rashevsky, therefore, 
the term “mathematics” in applications to LMS has two meanings: (1) the down-to-earth 
calculations along a physical model developing as a time series of states and (2) the geometrical 
(in Felix Klein’s sense) view of the system. He believed that the two should better be “objects of 
parallel studies.” 

As an example, a whole array of real structures, from the movement of a bird to
composing a love sonnet, can be mapped on a directed graph of relations. He specified the
edge of such a graph as “immediate precedence” or “immediate causation.” Similarly, the
movement of a paramecium toward food is a relation, which is fully preserved in humans, in spite
of the evolutionary gap. Rashevsky stated that a complex (i.e., LMS) system should be
described in terms of patterns defined as families of bijective mappings of one system on another
(homeomorphisms). In his Mathematical Biophysics, Rashevsky saw “the organism as a set of
mappings, categories, and equivalences.” (Rashevsky, 1960, Vol.2, p. 390). 
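
A minimal sketch of this idea, assuming that a system is reduced to a directed graph of "immediate precedence" edges and that a pattern is witnessed by a bijection of nodes that carries the edges of one graph exactly onto the edges of the other. The node labels and the function name are hypothetical and serve only the illustration.

```python
def is_relation_preserving_bijection(edges_a, edges_b, mapping):
    """Check that `mapping` is a bijection from the nodes of graph A onto the
    nodes of graph B that maps the edge set of A exactly onto that of B."""
    nodes_a = {node for edge in edges_a for node in edge}
    nodes_b = {node for edge in edges_b for node in edge}
    images = list(mapping.values())
    if set(mapping) != nodes_a:
        return False                      # some node of A is not mapped
    if set(images) != nodes_b or len(set(images)) != len(images):
        return False                      # not one-to-one and onto
    mapped_edges = {(mapping[u], mapping[v]) for u, v in edges_a}
    return mapped_edges == set(edges_b)


# Hypothetical labels: a paramecium and a human approaching food.
paramecium = [("sense_food", "swim_toward"), ("swim_toward", "ingest")]
human = [("smell_dinner", "walk_to_table"), ("walk_to_table", "eat")]

good = {"sense_food": "smell_dinner", "swim_toward": "walk_to_table", "ingest": "eat"}
bad = {"sense_food": "eat", "swim_toward": "walk_to_table", "ingest": "smell_dinner"}

print(is_relation_preserving_bijection(paramecium, human, good))  # relations preserved
print(is_relation_preserving_bijection(paramecium, human, bad))   # relations broken
```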


Rashevsky’s abstract ideas were further developed by his student Robert Rosen (1985-2000),
whose major interests comprised general theory of natural systems, including LMS. In
particular, Rosen was interested in anticipatory systems where the change of state depends on 
future circumstances, which returns us to the starter example. For such systems a model is a 
must. It was the interaction through an interface between the natural system and model system 
where the solution of the problem of anticipation was expected. Rosen looked at the coding 
through the interface between the abstract model and its real source as a creative act, stating that 
the coding did not belong to the system and its model. 

The mathematical structure he and Rashevsky used was category theory (Mac Lane, 1971;
Rosen, 1985-2). A category is a generalized monoid on a set of objects. The objects of
category theory can be various mathematical functions and structures. A monoid is a category with
one object, and a semigroup is a monoid without identity. What unites all these compositional
structures is associativity and the absence of the axiom of inverse, which alone does not 
mandate irreversibility, however. 
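
These properties can be checked on a small concrete case. The sketch below assumes, purely for illustration, the monoid of all functions from a three-element set to itself under composition: composition is associative, an identity exists, and most elements have no inverse. The constant functions form a semigroup without identity. None of the names come from the cited sources.

```python
from itertools import product

X = (0, 1, 2)
# Every function X -> X, stored as a tuple of images: f[i] is the image of i.
transformations = list(product(X, repeat=len(X)))   # 27 functions
identity = X                                        # the map i -> i


def compose(f, g):
    """Return the composition 'f after g'."""
    return tuple(f[g[i]] for i in X)


def is_associative(elements):
    return all(compose(compose(f, g), h) == compose(f, compose(g, h))
               for f in elements for g in elements for h in elements)


def is_closed(elements):
    return all(compose(f, g) in elements for f in elements for g in elements)


def has_identity(elements):
    return any(all(compose(e, f) == f and compose(f, e) == f for f in elements)
               for e in elements)


def has_inverse(f, elements):
    return any(compose(f, g) == identity and compose(g, f) == identity
               for g in elements)


invertible = [f for f in transformations if has_inverse(f, transformations)]
constants = [(c, c, c) for c in X]   # a semigroup with no identity element

print(is_associative(transformations), len(invertible), len(transformations))
print(is_closed(constants), has_identity(constants))
```

Only the six permutations are invertible; the other twenty-one transformations lose information, although nothing in the axioms themselves forces such loss.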





Rosen expected a complex system to be unpredictable, surprising, faulty up to a point, 
and exhibiting novelties. He, however, showed no affinity for the thermodynamics of Ilya Prigogine
who concentrated exactly on the surprise in the behavior of natural systems, but widely used 
realistic physical and chemical models. 

Both Rashevsky and Rosen, following Bertalanffy, understood theory as a mathematical 
structure with irrelevant semantics, self-contained and self-explanatory, which, probably, worked 
to fence off the curiosity of empiric scientists. 


A. C. Ehresmann (mathematician) and J.P. Vanbremeersch (physician; 1987, 1991, 
1999), developing Rosen’s ideas, came to the conclusion that natural complex systems “cannot 
be studied using observables defined on a fixed space of phases and following uniform laws” 
(1999). In their theory based on category theory, the set of elements in the universe is updated 
from state to state. It does it by “absorbing” some external elements and “suppressing” others. 
The efficient cause of an operation is a module called coregulator (CR), which seems to be an 
analogue of the enzyme in Gerhard Mack’s universal dynamics, see below. 

It is possible to formally remain under the axiom of closure because the system consists of
a hierarchy of levels of complexity. The trade-off is the high complexity of the theory itself that
has to trace all changes, ultimately, in terms of atoms. The theory, therefore, is actually a
representation of total available knowledge. The theory requires an internal memory. This
requirement, in my opinion, is partly satisfied only in the case of the mind. Neither the biosphere nor
society has (yet) a centralized memory. As far as human history is concerned, only
historians—but not the agents in AI sense—are in collective possession of such memory. 

To compare with Pattern Theory (Section 6), pattern, according to Ehresmann, A.C. and 
Vanbremeersch J.P. (1999) is: 


A pattern in a category K is the data of a graph I and a homomorphism P from I to K. 


Graph I here is the graph of arrows, i.e., associative morphisms interpreted as causal relations. 





The statistical study of wars and conflict risk estimates by Lewis F. Richardson (1960-1993),
whose studies of shorelines are said to have inspired Benoit Mandelbrot’s fractals, is closer in its
scope to the early works of Nicolas Rashevsky.

Among Richardson’s predecessors in quantitative history was Pitirim Sorokin (1937), a 
remarkable American sociologist and author of the multi-volume Social and Cultural Dynamics, 
whose life path once crossed with that of Lenin. The audacious step of Sorokin was to quantify a 
repeating historical pattern (such as war, revolt, etc.) not only in terms of extensive frequency but 
also intensive scale. This alone creates preconditions for hard science. 

Richardson, whose main works were published posthumously, left a fruitful and lasting
heritage: research institutions are named after him, and his works have been collected and republished. His effort
on collecting databases is continued on a large scale by Peter Brecke (Long, 2003) who is in 
charge of the project pursuing taxonomy of violent conflicts and identification of warning signs 
of conflicts by methods of pattern recognition at Sam Nunn School of International Affairs, 
Georgia Institute of Technology. This project might finally abate the historically well founded 
pessimism about the ability of humankind to draw lessons from its history. Clio has now a 
balance sheet. Notes on the methods of quantification of security are on the Web (Brecke, 2002). 
About the entire project, see Brecke (WWW). 

There is an impressive literature on Nonlinearity, Complexity, Combat & War (War, 
WWW). 

Among Richardson’s results was the conclusion that military conflicts follow Poisson 
distribution. To generalize, historical events are rare and too many of them at the same time are
improbable. This resonates well with the chemical paradigm where one or two, rarely three
molecules are involved in a “chemical event.” Both history and chemistry are local.
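
To make the point concrete, the Poisson probabilities can be computed directly. The rate used below (0.5 outbreaks per year) is a hypothetical value chosen for the example and is not taken from Richardson's data.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


rate = 0.5  # assumed mean number of outbreaks per year (illustrative only)
for k in range(4):
    print(k, round(poisson_pmf(k, rate), 4))

# Probability of three or more outbreaks in the same year.
three_or_more = 1.0 - sum(poisson_pmf(k, rate) for k in range(3))
print(round(three_or_more, 4))
```

At such a rate most years have no outbreak or one, and three or more simultaneous outbreaks are expected in fewer than two years out of a hundred.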


History, philosophy, and art—the realm of semantics—have been among life long 
interests of Ilya Prigogine, a Renaissance man and the founder of non-equilibrium 
thermodynamics and, arguably, the entire modern science of complexity. He repeatedly
approached the problem of human history in view of the behavior of the complex
non-equilibrium systems he was the first to study hands-on (Nicolis and Prigogine, 1989; Prigogine
and Stengers, 1984). He has been especially interested in the problem of time and its
irreversibility in history, in contrast with classical physics and contrary to Einstein’s dictum
"For us believing physicists, the distinction between past, present, and future is only an illusion, 
however persistent." (Einstein-Besso, 1972). 

Prigogine’s view of history has two major components: 

1. Prigogine united many natural systems, which develop complexity and order, under the 
notion of dissipative structures. The eddies in a fast flowing liquid are an example. If there is a
constant supply of work and its dissipation in the form of heat, the flow of energy keeps the 
system far from equilibrium and maintains order. A dissipative system can switch to different 
modes and manifest a coherence and order impossible at equilibrium. He suggested that life and 
society may belong to this class of systems. 

2. Prigogine suggested that the irreversibility of historical time is a consequence of the 
series of bifurcations and symmetry breaking in the evolution of dissipative structures. 

According to Prigogine, it is the sequence of random choices that makes history irreversible. In 
terms of dynamical systems, it means that the phase trajectories of a dissipative system are 
diverging: such a system does not possess Lyapunov stability. Prigogine compared its behavior
with kneading the dough (baker’s transformation). Slightly modifying his example, two raisins 
put in the dough side by side, will most probably diverge during the kneading. Note that the 
concept of stability of dynamic systems does not even involve thermodynamics with its energy, 
entropy, and temperature that need to be defined on social systems. 
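
The divergence of the two raisins can be followed numerically. The sketch below assumes the standard baker's map on the unit square and uses exact fractions so that rounding does not blur the doubling; the starting positions are arbitrary choices made for the example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def baker(point):
    """One kneading step: stretch horizontally by 2, cut, and stack."""
    x, y = point
    if x < HALF:
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)


def separation_history(p, q, steps):
    """Largest coordinate difference between two points after 0..steps kneadings."""
    history = []
    for _ in range(steps + 1):
        history.append(max(abs(p[0] - q[0]), abs(p[1] - q[1])))
        p, q = baker(p), baker(q)
    return history


gap = Fraction(1, 10 ** 6)                    # raisins start one millionth apart
raisin_a = (Fraction(3, 10), HALF)
raisin_b = (Fraction(3, 10) + gap, HALF)

history = separation_history(raisin_a, raisin_b, 15)
for step in (0, 5, 10, 15):
    print(step, float(history[step]))
```

For these starting points the separation doubles at every step, so fifteen kneadings magnify the initial gap more than thirty thousand times. No energy, entropy, or temperature enters the calculation.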

Prigogine has not applied his theory to history in any systematic way, yet some far- 
reaching conclusions follow. The dissipative system stays far from equilibrium while the 
dissipation of energy lasts. When it comes to an end or the system itself breaks down, it returns 
to an equilibrium or stays very close to it. Therefore, an LMS system, for example, an industrial 
civilization, cannot maintain its structure if either the supply of energy comes to an end or, which 
is sometimes overlooked, the dissipation becomes hindered because of the rising ambient 
temperature. The only alternative is a deep restructuring of the system. A dissipative system, like
any thermal machine, is positioned between supply of work (order) and disposal of heat (chaos).

All such reasoning, however, as any doomsday scenario, is of little immediate concern 
for most people because the internal time of non-equilibrium thermodynamics is not in any way 
connected to the calendar time. The present pattern of industrial civilization emerged after 1700,
when mineral energy had become involved into the process of dissipation by technology.

Nobody knows when the global baker will apply his hands next time, but thermodynamics warns 
us where to look for possible trouble. It suggests a gauge for vital signs of a civilization jammed 
between mineral fuel and atmospheric temperature. 


The concept of relational theory, with direct references to Rosen and further back to 
Rashevsky and, in most recent publications, to Ehresmann and Vanbremeersch, but without 
mentioning Pattern Theory, was expanded to a universal dynamics theory by physicist Gerhard 
Mack (1995 - 2001) who is looking for “a universal chemistry in which general objects and
links substitute for atoms and their chemical bonds,” starting with a “preaxiom: The human mind
thinks about relations between things or agents.” The inception of the project was influenced by 
philosophy and Wittgenstein in particular. The further development of this project is planned to 
take a form of computation. 

Mack’s universal dynamics is relevant for the topic of history for two reasons. First, it 
covers not just a system but also a history of a system, which is itself a system called drama. 
Second, it advances a very fundamental idea of locality (first implemented by Maxwell’s 
dynamics instead of Newton’s action at a distance) in the following axiom where arrow is a term 
for morphism in category theory: 


4. locality: Some of the arrows are declared direct (or fundamental); they are called links. All
arrows f can be made from links by composition and adjunction, $f = b_n \circ \dots \circ b_1$ ($n \ge 0$), where $b_i$
are links or adjoints [reversed arrows] of links; the empty product ($n = 0$) represents the identity.
(Mack, 2001) 


The elementary act of change in the system, therefore, happens only in the topological
1-neighborhood of an object, where a composite link is made fundamental or a fundamental link
is made virtual. When the last fundamental link is gone, the object dies. 






Figure 1: “Catalysis in chemistry and elsewhere. A catalyst C binds molecules A and 
B. First a substrate-enzyme complex is built, where A and B bind to C. Next the 
composite arrow from A to B becomes fundamental” (the Figure and caption are from 
Mack, 2001). 


The atoms of a minimal change of the system are transformations of the graph of 
relations in the 1-neighborhood of a node: the edge can be either eliminated or established and 
the node can be duplicated. Mack distinguishes between the following four transformations, of 
which the first is the atomic one. 

1. motion: “promoting” a composite arrow to the status of a direct one, i.e. link; 

2. growth: by making a copy through a series of motions and duplications of nodes; 

3. death: the only irreversible transformation, consisting of removing all the links; 

4. cognition: approximately can be described as recognition; it establishes links between
two non-atomic objects, which are, typically, isomorphic; cognition creates new arrows.

The closest to cognition phenomenon is enzyme-substrate interaction. The idea of the 
move is illustrated by Gerhard Mack with the example of generalized catalysis, Figure 1. 
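
The transformations of motion and death can be sketched on a toy graph. This is a simplified reading, not Mack's formalism: links are treated as plain directed pairs, adjoint (reversed) arrows are ignored, and a composite arrow is limited to a two-step path. All function names are hypothetical.

```python
def composite_arrows(links):
    """Arrows a -> c obtained by composing two links a -> b and b -> c
    that are not yet direct links themselves."""
    return {(a, c)
            for (a, b) in links
            for (b2, c) in links
            if b == b2 and a != c and (a, c) not in links}


def motion(links, arrow):
    """Promote a composite arrow to the status of a direct link."""
    if arrow not in composite_arrows(links):
        raise ValueError("arrow is not a two-step composite of existing links")
    return links | {arrow}


def death(links, node):
    """Remove every link of a node; with no links left, the node is gone."""
    return {(a, b) for (a, b) in links if node not in (a, b)}


# Generalized catalysis: A and B are both bound to the catalyst C.
links = {("A", "C"), ("C", "B")}
print(composite_arrows(links))           # the composite arrow from A to B

after_motion = motion(links, ("A", "B"))  # A -> B becomes fundamental
after_release = death(after_motion, "C")  # the catalyst leaves the complex
print(sorted(after_release))
```

Every change touches only the 1-neighborhood of the nodes involved, which is the locality that the axiom requires.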


In universal dynamics the local transformation is performed by enzyme. It is not exactly 
what catalysis means in chemistry, but rather what command means in programming and 
coregulator in the system of Ehresmann and Vanbremeersch (1999): the efficient cause of a 
local change in the state of the system, i.e., a demon of a kind. 

On the surface, the system under universal dynamics looks like a transformation of a 
connected graph that preserves its connectedness and occurs as a sequence of atomic changes 
one edge at a time. I do not see, however, how any new object can appear in the drama (which
can be stochastic) and the notion of novelty does not surface in a mathematical structure under
closure. 

The question that arises when a compositional mathematical structure without a unique
inverse is applied to a real system is: what if two causation arrows contradict each other? A 
similar question: what if two diverging arrows cause incompatible events? These questions do 
not apply to always “self-consistent” physical systems but are typical for LMS. I do not know the 
answers, although Ehresmann and Vanbremeersch (1999) allow in principle such contradiction 
of arrows. 

For details one should better look into the sources. This project goes far beyond the initial 
relational concept of Rashevsky and Rosen and touches upon the deepest foundations of 
sciences. It has been evolving and some aspects lack details while others, such as theoretical 
physics, are beyond my competence. 

A small set of minimal transformations can be found also in the relatively recent 
“Evolving Algebras” of Yuri Gurevich (1995) where the arity (number of arguments) of the 
transition function can be >1. Convergent or divergent incompatible outcomes would make 
computation impossible, but in political life, the right hand often does not know what the left 
hand is doing. 

The idea of 16 archetypal morphologies, i.e., basic transformations of realistic structures, 
among them—adding or removing an edge or a node of a graph—goes back to Rene Thom 
(1973) who saw it as beginning, ending, fastening and cutting. Thom’s ideas, unjustly trampled
(as Thom believed, because they did not deliver a quantitative prediction), may be a key to the
radical simplification of social complexity. Thom’s stirring, which looks like a bell-shaped
curve, can be identified with the transition state (see Section 6). The concept of catastrophe itself
is what happens between the initial and final states.

From the chemical perspective, a set of two operations is sufficient to build a structure: 
form and delete a bond. In Pattern Theory (see the corresponding section), a node (generator)
can be added and deleted. The confusing abovementioned situations in both chemistry and
Pattern Theory are irregular, but natural: irregularity is just a degree of regularity, like chaos is
a degree of order and vice versa.





Artificial intelligence used to ignore individual history or story (Schank and Abelson, 
1995) of a robot or computer. The role and possible use of story in complex systems has been 
insightfully analyzed by Chrystopher Nehaniv and Kerstin Dauterhahn (Nehaniv 1998-1 and 
-2) and summarized as: “Being alive means having a lifetime, without a reset button.” (Nehaniv 
and Dauterhahn, 1998-2). Nehaniv et al. connected story (ontogenesis in biology) with the
thermodynamic irreversibility of life and assigned to semigroups the label of “algebras of time.”
Nehaniv et al. formalized the unique direction of ordering in semigroups and included “pools of
reversibility” into the general semigroup structure that does not require irreversibility 
axiomatically. Nehaniv and Dauterhahn (1998-2) explicitly stated the problem of the limits that 
a closed mathematical structure imposes on the realism of representation but pointed to a 
“dynamically changing” what the system considers as its inputs and which are relevant as a 
possible direction of solving the problem. 

Thermodynamics, which establishes a preferred direction of time, can hardly be 
classified as either syntax or semantics. It is external to timeless mathematical structures and 
finds no place there. The relation theories expurgate metrics together with semantics and find no
place for thermodynamics. An intriguing case of thermodynamics on a formal but tainted with
realism system is reversible computation (Rolf Landauer, 1961; Charles H. Bennett, 1973).
It arose from the problem of heat generation by the computers that have to erase all intermediate
data and clear the registers after each computation and, therefore, dissipate heat without creating
information. See also Section 6 in Bremermann (1974).

The information produced during the individual history of a finite state automaton, such 
as, for example, Turing machine, is discarded or, more generally, the mapping of a state back 
onto the previous state of a computation cycle is not bijective.

The problem “how to save energy”, therefore, resonates with “how to remember the 
lesson of history and survive the next election,” in political language, as an alternative to starting 
the second campaign from scratch. 

The reversible computing works on a new task by 

1. saving all intermediate results, 

2. generating the output, 





3. retracing the computation back to the input, which is possible because of the saved
history of the computation, while reversibly erasing only the history. While reversing the
computation, a Turing machine that represents the finite automaton has to change the
read-shift-write quintuple

$\langle A \; T \to T' \; \sigma \; A' \rangle$

where A, A' refer to states, T, T' to the tape, and σ is the movement of the head,
to the write-shift-read quintuple.

The overall result is the decrease in the dissipation of energy, which is still higher than in 
the direct computation alone but less than in the entire cycle of irreversible computation. 
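
The three stages can be imitated in a few lines. The sketch below is a toy model of the compute, copy, and retrace cycle, not Bennett's Turing-machine construction: each step is an ordinary, possibly many-to-one function, and the saved history is what makes the run retraceable. The step functions and all names are assumptions made for the example.

```python
def run_with_history(value, steps):
    """Stage 1: apply each step, saving every intermediate result."""
    history = []
    for step in steps:
        history.append(value)
        value = step(value)
    return value, history


def retrace(value, steps, history):
    """Stage 3: walk back to the input, consuming the history record by record."""
    history = list(history)
    for step in reversed(steps):
        previous = history.pop()
        if step(previous) != value:
            raise ValueError("history does not match the computation")
        value = previous
    return value, history


def reversible_cycle(x, steps):
    result, history = run_with_history(x, steps)
    output = result                       # Stage 2: copy the output aside
    restored, leftover = retrace(result, steps, history)
    return restored, output, leftover


# Each step loses information when taken alone (squaring forgets the sign).
example_steps = [lambda v: v * v, lambda v: v % 7, lambda v: v + 3]
print(reversible_cycle(-5, example_steps))
```

At the end of the cycle only the input and the copied output remain; the history has been emptied step by step rather than discarded at once.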


Motivated by the goal of bridging the gap between history and sociology, Bertrand M.
Roehner (physicist) and Tony Syme (economist and sociologist) formulate their subject in
Pattern and Repertoire in History as analytical history (Roehner, 2002). In their most
innovative, wide in scope, and rich in facts work they follow the agenda of Charles Tilly (1981,
1984) who—very much in line with one of the stimuli that drive chemists and mathematicians— 
wanted to know whether the repertoire of elementary blocks in history really existed. Tilly, the 
student of Pitirim Sorokin, made clear that the binary relations between the blocks were of the 
primary interest for a historian-sociologist. 

It may seem that, since history as a whole is a unique sequence of events, developing at 
specific times and locations, it can be only described as a whole but not studied by the scientific 
method. Natural sciences, since Aristotle and through Descartes, follow the method of analyzing 
a complex phenomenon and dividing it into simpler units. A closer look at history reveals that it 
is built of standard modules which repeat across time and space and can be compared and 
generalized in the form of patterns. The perception of wars, revolutions, and reforms as standard 
blocks of history is by no means new. The authors discover, however, that the smaller building 
blocks of the large events are also to some extent standard and enclosed in each other in fractal 
manner. Moreover, the smaller blocks are highly repetitive throughout time and geography. 





Roehner and Syme note that real physical processes, like the warming of a cup of cold 
drink, are by no means simple. Only by splitting them into simpler problems has physics managed to
successfully explain them. They refer to simplicity as “a basic requirements of the human mind.” 

The main methodological idea here is to look for patterns or regularities in history at a 
microhistorical level, largely regardless of their macrohistorical environment. The obtained 
knowledge can be further used for explanations at the macrohistorical level. 


To give an example, the apparently unique meetings of the French Estates-General are 
elements of a larger set that includes also the meetings of the parliaments of German and other
European states of similar historical periods, while the French Revolution itself is an element of
the set of European revolutions, successful, as well as abortive. The sale of Church property
occurred also twice in England and once in Austria under completely different circumstances.

The comparative method, with its long jumps, lies beyond typical historiography, but the 
description of modules follows the historiographic tradition of a consistent narrative. To 
completely appreciate the wealth of material, the book must be read in its entirety. Many 
episodes of world history reviewed in it can hardly be found in general courses. The authors’
analysis of difficulties and perspectives, as well as the task of a computerized Very Large
Chronicle project is marked with the enthusiastic practicality typical for large scientific projects, 
although to recall the Manhattan Project would be politically incorrect. 





Figure 2. Monarchy, nobility, and clergy in three kingdoms, along 
Roehner, 2002, Fig. 2.1, p. 76. M = Monarchy, N = Nobility, C = Clergy 






Any modern historical narrative can be illustrated by graphs, diagrams, and tables. In one
case we can see a different type of a graph, see Figure 2. It presents the relative strength of the
relationship as the number of connecting lines. The concentric circles in the triangle of Britain 
reflect the fuzzy borders of nobility. Apparently, the weight of the influence can also be denoted 
by the area or appearance of the circle. 



Figure 3. The structure of visits at Duisburg Zoo, from Krempel 1. Source:
http://www.mpi-fg-koeln.mpg.de/~lk/netvis/zoo1.html


Analytical history naturally conjoins with non-equilibrium dissipative structures because, 
as Prigogine noted, complexity is what needs more words to describe it. Extending his reasoning, 
these necessary words should be taken from a dictionary different from that of simplicity. As 
“hard” scientists, Prigogine and his colleagues can always describe not just the wonder of
organized behavior but its detailed mechanism for any dissipative structure, especially, of a
chemical nature. Analytical history provides us with both vocabulary and structural elements
from which the historical complexity is built and the authors are looking toward the syntax
(“grammar”) of it all.


The extent to which a social structural chemistry can be quantified is shown by the achievements of the
fast-developing trend of visualizing social networks (Lothar Krempel). This approach is
strongly metrical. A number is associated with both nodes and edges of the graph, and all nodes 
are labeled with unambiguous semantics. 

Two examples will do here. 


Figure 3 presents the structure of visits at Duisburg Zoo (Germany). The size of the
spheres corresponds to the number of visits and the width of the connecting lines gives the 
traffic. For example, the largest sphere in the bottom left hand corner is Main Entrance, from 
which one can go NW to Horses and from them to Kangaroos or NE to Parrots. 

Obviously, we deal here with a very general representation. This picture can be 
transformed into an endless variety of configurations filling up a topological space with 
Hamming metrics. The neighborhood of a configuration includes all configurations with a single 
changed bond. This closed structure can be opened by expanding the neighborhood to new 
nodes, while their elimination would ensue after removing all links. The Duisburg Zoo must
have had its own history in which sections were added, removed, and added again. This history 
would also reflect changes in traffic and number of visits. If normalized, the numerical values 
can be expressed as probabilities. 
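
A small sketch of these notions, assuming that a configuration is the set of bonds present between a fixed set of nodes. The four node names are taken from the description of Figure 3, while the traffic counts are invented for the example; the function names are hypothetical.

```python
from itertools import combinations


def hamming(config_a, config_b):
    """Number of bonds present in one configuration but not in the other."""
    return len(config_a ^ config_b)


def neighborhood(config, nodes):
    """All configurations that differ from `config` by a single bond."""
    possible_bonds = [frozenset(pair) for pair in combinations(sorted(nodes), 2)]
    return [config ^ {bond} for bond in possible_bonds]


def normalize(weights):
    """Turn raw counts into fractions that sum to one."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


nodes = {"Entrance", "Horses", "Kangaroos", "Parrots"}
entrance_horses = frozenset({"Entrance", "Horses"})
horses_kangaroos = frozenset({"Horses", "Kangaroos"})
entrance_parrots = frozenset({"Entrance", "Parrots"})
zoo = frozenset({entrance_horses, horses_kangaroos, entrance_parrots})

neighbors = neighborhood(zoo, nodes)
print(len(neighbors), {hamming(zoo, n) for n in neighbors})

# Hypothetical traffic counts along each path.
traffic = {entrance_horses: 50, horses_kangaroos: 30, entrance_parrots: 20}
probabilities = normalize(traffic)
print(probabilities[entrance_horses])
```

With four nodes there are six possible bonds, so each configuration has six neighbors at Hamming distance 1. Adding a new node enlarges the neighborhood, which is how the closed structure is opened.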

The second illustration, Figure 4, shows two snapshots of the dynamics of automobile
trade in 1980 and 1994. These configurations are already very close to those of Roehner and Syme.

A stunning variety of visualizations can be found in: Dodge (2001). 


To conclude the review, the entire area of structural representation in humanities owes a 
lot to Claude Levi-Strauss and structuralism (Lane, 1970). It seems that the initial wave of
structuralism could not sustain itself in the cannibalistically competitive atmosphere of
humanities with its generation wars and post-ism waves, but he is unquestionably among the 
founding fathers of the future consensus. 





Figure 4. Dynamics of world automobile trade in 1980 (left) and 1994 (right). From 
Krempel 1. Source http://www.mpi-fg-koeln.mpg.de/~lk/netvis/global/autochap3.html 


In this review I have left out a large area of mathematical sociology, already sufficiently 
naturalized. For the entrance to the area, see ASA (WWW). The volume of works in the
2-neighborhood (and farther away) of “hard” history is enormous.


4. History and thermodynamics 


Neither history, nor thermodynamics, nor, especially, mathematics belongs to my
professional area. The following is just an informal look from the side, aiming at some distinctive
properties of Pattern Theory. 

Reversible computing, although it turned out to be impractical, opened an entire and still unexplored
area of using history for applied problems. Bennett’s paper (1973) is interesting also from 
another angle. It ends with a comparison of computation with mechanisms of molecular biology 
such as the template synthesis of messenger RNA and its reversible degradation. Bennett notes
that although this process is also somewhat wasteful, it is much more efficient than irreversible
computation. Bennett’s description of the role and thermodynamic mechanism of catalysis 
stands alone with its high chemical accuracy. 

What is known about enzymatic action fits well the framework of organic chemistry. 
Enzymes are proteins folded into a particular shape, which presents a real mystery. They easily 
solve the problem of their own folding into whimsical but reproducible shapes, which for 
humans is one of the hardest computational problems requiring brute force. This representational 
gap—it should not be hard for us what is easy for nature—may mean that it is not the final shape 
that is coded in their primary structure, but the reversible history of folding. 

To illustrate the relation between history and thermodynamics, let us invoke again
Maxwell’s Demon, whose behavior can be mapped on the behavior of a human being. A
biologist’s argument against Demon can be as follows. Certainly, Demon can perform his job
while he is alive. He would need energy, however, to process the visual information, send the
efferent signals to his hands, and work against the door spring or, if there is no spring, against the 
inertia of the door. Therefore, the demon is possible, modulo his energy supply. He will have a 
history with a beginning and the end. This argument—only half-unserious—implies that in order 
to have a history, a system needs to dissipate energy, whether it consumes it or not. This 
argument does not even mention entropy. 

In a mathematical system, the act of combination of some elements of a set into a subset, 
as well as any operation and mapping, does not require any energy, which is beyond the concept 
of mathematical structure. But why? Nothing in the nature of mathematics forbids us to assign 
metrics to the operation of combination. In reality—physical, chemical, biological, and social— 
two objects combine because either they have attraction to each other (i.e., the combined state 
has a lower energy than a separated one) or there is a source of energy to keep them together in 
spite of their mutual repulsion, or there is a constraint on separation. 

Thermodynamics is absent from the syntactic representations of complex systems. 

Neither does thermodynamics belong to semantics. Thermodynamics follows from the idea that
to each binary mathematical operation on the real system corresponds a change of some value
(energy, free energy, attraction, stress, strain, etc.), which arranges such operations in at least a
partially ordered set. 





Coming back to the problems with the mathematical structures under closure, noted by 
Nehaniv and Dauterhahn (1998-2), the situation in a mathematical system is the same as with 
Prigogine’s system where the phase trajectories not only diverge but also converge. The 
transition functions in a representation of the system may not have a single inverse and they 
form a semigroup, non-restrictive to reversibility. I tend to believe that an open mathematical 
structure for real systems should include “Prigogine axiom” which—scratching the ear of a 
mathematician—would sound something like: 


There is a distance d = n between the elements of a semigroup expansion (product of all
partial histories, i.e., total history without repetitions) such that if d > n, the expansion has
no inverse. 


Irreversibility in LMS does not follow from thermodynamics alone. It follows from an 
open character of the system in which information is erased and lost. For such a system to exist
without essential loss of complexity, an acquisition of new information is necessary.

It is easy to formalize loss, but how to formalize novelty? Transformation is a function f on
set X:

$f : X \to X$

This leaves no place for either expansion or contraction of set X. Transformation $f_o$ on
an open set would mean:

$f_o : X_t \to (X_{t+1}, \Delta X_{t+1})$,

where $\Delta X_{t+1}$ is a change of the universe of the mathematical system.

To abandon the axiom of closure, however, would mean a profanation of the entire 
mathematics. Bourbaki, probably, realizing that, ended his Set Theory (Bourbaki, 1968) with a 
strange construct: the set which is the mathematical image of emergence: a combination of 
elements of the basic set becomes a new element in an expanding and partially ordered set: the 
scale of sets. 





In a very general sense, there is an obvious but rather metaphoric relation between gauge 
theories in physics (Moriyasu, 1983) and any theory invariant regarding the change of the base 
set. It seems that Pattern Theory is the only developed theory of this kind. Probably, only a
very abstract mathematical analysis can answer the question whether the world from physics to 
poetry is essentially uniform. From the point of view of PT, this is certainly so. Intuitively, it 
seems that Robert Rosen was right and simple systems are rare exceptions in the curved evolving 
universe. It is interesting to see Gerhard Mack’s progress. 

The evolving scale of sets with forgetting and loss of elements, balanced with an inherent 
expansion is, in my opinion, a basis for an open associative mathematical structure that does not 
contradict the axiom of closure because the basic set, which for real LMS systems is the alphabet 
of a language or the elementary percepts (cells of the retina, tactile receptors, etc.), remains 
constant at least in the medium run. The function of the language is to create a sufficiently large 
combinatorial space for a limited size of the combination. Taking an average length of a word as 
5 letters and an alphabet of 25 letters, the abstract word space contains $25^5 \approx 10^7$ words. Only a tiny part of 
this space is occupied by existing words, as a tiny part of the display pixel space is occupied by 
meaningful images. This property of an extremely sparse use of combinatorial space is an 
intrinsic property of LMS. 
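
The sparseness of the word space can be checked with a back-of-the-envelope count. The sketch below is only an illustration: the alphabet size and word length are the ones named in the text, while the figure of 10,000 existing five-letter words is an assumed round number, not a value taken from any source.

```python
# Size of the abstract word space for fixed-length words over a finite alphabet.
ALPHABET_SIZE = 25   # letters, as in the text
WORD_LENGTH = 5      # average word length, as in the text

# ASSUMPTION (not from the text): a round illustrative count of real five-letter words.
ASSUMED_EXISTING_WORDS = 10_000

word_space = ALPHABET_SIZE ** WORD_LENGTH
occupied_fraction = ASSUMED_EXISTING_WORDS / word_space

print(f"abstract word space: {word_space:,} strings")
print(f"occupied fraction:   {occupied_fraction:.4%}")
```

Under this assumption only about one string in a thousand is an actual word, and the fraction shrinks rapidly as the word length grows.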

Instead of describing LMS in thermodynamic terms as systems far from equilibrium, we 
can describe them as systems that always occupy only a very small part of all available 
combinatorial space. Complex systems, therefore, are simpler than they could theoretically be. 

Although any concept and even a word of the vocabulary has its own history, this history 
is usually forgotten. For example, the words war and chemical bond have completely lost their 
histories, while the word rose still ties its meaning to some sensory perceptions. In this way, a 
knowledge representation becomes an open system. 


5. Pattern Theory 


A mathematical theory that can be recognized by a chemist as meta-chemistry was first 
introduced in the 1960s, around the same time as “categorical” theory. Pattern Theory of Ulf 
Grenander (1976-2003) assigns metrics and structure to the naked operation of composition. It 
incorporates not only atomism and Klein’s geometry (transformational invariance) but also 
opens the door to thermodynamics. Moreover, it opens to openness itself. To the eye of a chemist, 
it does so by realistically implementing atomism and the principle of locality. 


Constructs of atom-like elements connected in a certain order are the original subject of 
chemistry, not borrowed from physics. Both chemistry and sociology, as well as a large number 
of other disciplines, regardless of whether they are of human or inanimate origin, describe 
objects of various nature and complexity in terms of their structural elements: building blocks 
and bonds between them. Pattern Theory is the mathematics—and universal chemistry—of such 
objects. 

The papers and books on relational LMS are conspicuously devoid of realistic examples. 
By contrast, the first impression one might get from a book on Pattern Theory (Grenander, 
1995) is the cornucopia of illustrations as different as skeletons and languages, fairy tales and 
stomach shapes, mitochondria and clustering of human settlements, with a potato on the cover 
and two color plates of battlefield paintings inside. 

Pattern Theory, when first introduced, was based on four principles: 

(1) atomism of building blocks (generators) possessing a selective ability to (2) 
combine with each other according to rules of combination, which leads to regular 
configurations that are (3) observable as images. The fourth principle is (4) realism: 


A theory of patterns that would not take into account how actual patterns behave would be 
severely limited in its applications. We must, therefore, assure that the theory is realistic so that it 
can deal with real patterns (Grenander, 1976, p.3). 


This is the kind of realism an empirical scientist would expect from a theory. Let us note 
that by building blocks Ulf Grenander understood also “abstract symbols, sets, relations, or 
functions,” which puts it at least at the same level of generality as category theory. 





NOTE ON ATOMISM. Atomism is traditionally attributed to Democritus and 
Lucretius. Regarding the philosophical roots of modern mathematical atomism, it seems 
unfair to omit Leibniz with his monadology: 

“3. These monads are the true atoms of nature, and, in fact, the elements of things. 

8. Whatever is in a composite can come into it only through its simple elements and the 
monads, if they were without qualities (since they do not differ at all in quantity) would be 
indistinguishable one from another” (Leibniz, 1973). 

Remarkably, as if anticipating the universal dynamics, Leibniz, an antipode of 
Newton, explained the interaction at distance by the composition of local interactions. 


The atomistic principles, based on the variety of qualities of the atoms, underlie Pattern 
Theory, in which (informally): 

1. Primary atomic elements (generators) have labeled bonds (bond structure), similarly to 
atoms in chemistry. 

2. Generators with similarly labeled bonds form bond couples, thus combining into 
configurations, similarly to molecules in chemistry. In Grenander’s language, the generators 
interact. 

3. Both generators and their binary composites (bond couples between them) can be 
attributed a rich variety of properties, including equivalence classes and numerical values, such 
as, for example, the acceptor function that defines the selective affinity of generators toward each 
other. A pattern system can be fine-tuned. Analysis and synthesis of patterns rely on heuristics as 
much as on formalism. 

4. In a closed system, probabilities of configurations are determined by probabilities of 
generators and bond couples, similarly to chemical equilibrium and chemical additivity. 

5. Observable images are introduced as equivalence classes in configuration space, 
similarly to classes and conformations in chemistry; 

6. Regularity $R$ is defined on a set of configurations as the quartet $R = (G, S, \rho, \Sigma)$, 
where $G$ is the generator space, $S$ is the similarity transformation defining equivalence classes in 
generator, configuration, and image spaces, $\rho$ is the bond value relation for coupling bonds, and 
$\Sigma$ is the type of connector graph, which, in essence, is also defined through local properties. 

Regularity in chemistry is called stability : the property of a chemical structure to be 
isolated and stored for significant time. In both PT and chemistry irregularity means higher 
energy as compared with the regular structures. 

7. A template is a representative member of an equivalence class of regular configurations: 
other regular configurations can be obtained from it by a group of similarity transformations. Thus, 
under a particular (rather loose) regularity R, an indefinite number of sociological and 
economic structures can be obtained by transforming Figure 3 modulo R. Given the Periodic 
Table and regularity of chemistry, the entire chemistry can be built, bond by bond, from any 
compound and elements, which, by the way, is not only feasible in the lab but was the actual 
origin of biochemistry on the Earth. The problem becomes much more complicated in sensory, 
and especially, visual images. 
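
The informal principles listed above can be sketched in a few lines of code. This is a toy illustration under stated assumptions, not Grenander's formalism: the class names, the label-matching bond relation `rho`, and the generators U, V, W are all hypothetical, and the similarity group S and connector type Σ are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    """An atom-like building block with labeled bonds (hypothetical toy type)."""
    name: str
    bonds: tuple  # bond labels, e.g. ("x", "y")


def rho(label_a: str, label_b: str) -> bool:
    """Toy bond value relation: two bonds may couple only if their labels match."""
    return label_a == label_b


@dataclass
class Configuration:
    generators: list  # list of Generator
    couples: list     # list of ((gen_index, bond_index), (gen_index, bond_index))

    def is_regular(self) -> bool:
        """Regular if every bond couple satisfies rho and no bond is used twice."""
        used = set()
        for end_a, end_b in self.couples:
            for end in (end_a, end_b):
                if end in used:
                    return False
                used.add(end)
            (ga, ba), (gb, bb) = end_a, end_b
            if not rho(self.generators[ga].bonds[ba], self.generators[gb].bonds[bb]):
                return False
        return True


u = Generator("U", ("x",))
v = Generator("V", ("x", "y"))
w = Generator("W", ("z",))

uv = Configuration([u, v], [((0, 0), (1, 0))])  # bond x couples with bond x
vw = Configuration([v, w], [((0, 1), (1, 0))])  # bond y against bond z violates rho

print(uv.is_regular())
print(vw.is_regular())
```

The check is purely local: regularity of the whole configuration is decided bond couple by bond couple, which is the property that makes the analogy with chemical valence rules possible.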


The realistic character of Pattern Theory makes it highly recognizable as a good 
“meta-chemistry” for a chemist. I have already discussed the chemical realism of PT (Tarnopolsky, 
2003). Moreover, although PT is algebraic in the treatment of groups of transformations that 
identify images as equivalence classes of configurations, the set of generators may be open. 

This idea was expressed by Ulf Grenander in his first large pattern system MIND (Grenander, 
2003, p.7): 

The set of all generators available to a particular mind will be denoted by G, the generator space. 
As time goes on G may change: new primitives may be acquired, others forgotten, but for the 
moment we shall treat the generator space as fixed. 


This seems like a radical idea for a mathematical system, and, for that matter, even for 
systems theory. MIND contains other radical ideas, among them, that only a small part of the 
generator space (content) is involved in the ongoing transformation of the mind and, moreover, 
the content is drifting in a regular (in the sense of PT) manner through the configuration space. 

It means, in my interpretation, that the mind, unlike the computer (and unlike a robot in AI), has 
no random access memory: the most probably accessible memory is the 1-neighborhood of the 
generators in the content. The depth of memory can also be accessed, but at lower probability, as 
a “sudden revelation” or “creative act.” The next state of the content is formed from the 
generators connected to the previous one. This prevents the degeneration of configurations to 
negligible probabilities. The mind is a device for self-discipline. 
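
The idea of memory without random access can be illustrated by a walk on a graph. The sketch below is a deliberately crude assumption-laden model, not the MIND system itself: the content is reduced to a single focus generator, the association graph `GRAPH` and its node names are invented for the example, and `p_local` is a hypothetical parameter for the probability of staying within the 1-neighborhood.

```python
import random

# Hypothetical toy association graph: each generator lists its neighbors.
GRAPH = {
    "king": ["empire", "army"],
    "empire": ["king", "border"],
    "army": ["king", "bridge"],
    "border": ["empire", "river"],
    "bridge": ["army", "river"],
    "river": ["border", "bridge"],
    "hare": ["arrow"],   # a remote region of memory
    "arrow": ["hare"],
}


def next_focus(current, p_local, rng):
    """Pick the next generator: usually a neighbor of the current one, rarely any other."""
    if rng.random() < p_local:
        return rng.choice(GRAPH[current])   # access through the 1-neighborhood
    return rng.choice(sorted(GRAPH))        # rare jump into the depth of memory


def drift(start, steps, p_local=0.95, seed=0):
    """Return the sequence of focus generators visited during the drift."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_focus(path[-1], p_local, rng))
    return path


print(drift("king", 10))
```

With `p_local` close to one, the remote pair hare/arrow is reached only through a rare jump, which plays the role of the “sudden revelation” in this toy picture.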

I would interpret Grenander’s MIND as not a point but a cloud of probability moving 
through the phase space of the system. In the present version of MIND the model is, essentially, 
ergodic, which an evolving LMS system is not: the configuration space changes through inputs, 
forgetting, and the steps of Bourbaki’s scale. Of course, ergodicity makes no sense 
whatsoever for very large combinatorial spaces with times of search exceeding the age of the 
universe. For example, the pixels of a digital image generate over $2^{1000000}$ combinations. 
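
A rough order-of-magnitude check shows why exhaustive search is out of the question. The sketch assumes a one-megapixel image with binary pixels, an age of the universe of about 13.8 billion years, and an arbitrary, very generous search speed; the speed in particular is an assumption made only for illustration.

```python
import math

PIXELS = 1_000_000           # binary pixels, giving 2**PIXELS possible images
AGE_OF_UNIVERSE_S = 4.35e17  # about 13.8 billion years, in seconds
# ASSUMPTION (not from the text): an arbitrary, generous search speed.
STATES_PER_SECOND = 1e18

# Number of decimal digits of 2**PIXELS, computed without building the huge integer.
digits = math.floor(PIXELS * math.log10(2)) + 1
log10_visited = math.log10(AGE_OF_UNIVERSE_S * STATES_PER_SECOND)

print(f"2**{PIXELS} has {digits:,} decimal digits")
print(f"states visited in the age of the universe: about 10**{log10_visited:.0f}")
```

The number of states that could be visited has a few dozen digits, while the number of possible images has hundreds of thousands of digits.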


NOTE: To make a pattern-theoretical equivalent of an open system, one only needs to 
establish a competition for a limited resource between configurations. Loss of 
information is an equivalent of dissipation, while acquisition of information can be 
arranged through the Boolean function of novelty (Tarnopolsky, 2003). 


The model of MIND reproduces the important property of LMS: the change is local. It 
means that changes in all sufficiently large systems occur in limited areas. To put it differently, 
significant events in LMS, like earthquakes, are expected to follow a Poisson distribution and a 
large number of simultaneous changes is improbable. What was confirmed by Lewis F. 
Richardson for wars may be true also for Washington scandals. 

Another radical idea of Pattern Theory seems to be directly opposite to the exclusively 
relational approach. Both generators and relations between them (bond couples) can be 
associated with labels and numbers, for example, bond values, probabilities, or energies, as well 
as global temperature, which makes Pattern Theory a kind of intrinsically thermodynamical 
mathematical system, deeply analogous to chemistry. Given the regularity, it invisibly 
“calculates” its own probability distribution over the configuration space in the same manner the 
chemical system “calculates” its equilibrium or proteins “calculate” their folding. Monte Carlo 
methods are used to compute this equilibrium, which otherwise would require as much 
computation as protein folding. 

The best way to Pattern Theory, with its unique combination of abstraction and realism, 
is to turn to the original sources. My immediate goal is to see what it can do for history. 


The pathway from Pattern Theory to history goes through chemistry, which has been a 
metaphor for LMS, starting from Leo Tolstoy who regarded big historical movements of people 
and ideas as fermentation. 


6. Chemical kinetics: the short run 


I discussed the relation between PT and chemistry elsewhere (Tarnopolsky, 2003). Here I will 
focus on the concept of transition state from the meta-chemical perspective, i.e., in terms of 
generators and configurations of PT, which significantly idealizes the real chemical situation. 

The concept of transition state (Eyring and Polanyi, 1931) is a very general concept of 
dynamics. 

Transition state theory (TST), introduced by Eyring and Polanyi ... in 1931 as an early attempt to 
determine absolute reaction rates, is too often considered the domain of the chemist or chemical 
physicist. However, the transition state (TS) is actually a general property of dynamical systems 
which involve an evolution from “reactants” to “products.” Such processes include, but are by no 
means limited to, the ionization of atoms, the dissociation or reaction of molecules, and even the 
escape of an asteroid from its orbit (Jaffe et al., 2000). 

The transition state theory is the central explanatory paradigm of chemical kinetics. It 
assumes that between configurations A and B on the same generator space the position of 
equilibrium is defined by: 

$[B]/[A] = K_e$; $[A] + [B] = C$; 
$\ln K_e = -(G_B - G_A) / kT$, 

where [A] and [B] are probabilities (or concentrations, in chemical terms), C = 1 for 
probabilities, C is total concentration in chemistry, $K_e$ is the equilibrium constant, and $G_B$ and $G_A$ are 
energies of the two configurations, additive over generators and bonds. They are Gibbs energies 
$\Delta G = \Delta Q - T \Delta S$ in chemistry, where Q is thermal energy, T is temperature, and S is entropy. 
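
A two-state equilibrium of this kind can be computed directly from Boltzmann weights, so that the configuration with the lower energy comes out as the more probable one. The sketch below is a minimal illustration; the function name and the energy values, given in units of kT, are hypothetical.

```python
import math


def two_state_equilibrium(g_a, g_b, kT=1.0):
    """Return (p_A, p_B) for two configurations with energies g_a and g_b.

    Probabilities are Boltzmann weights exp(-G/kT), normalized so p_A + p_B = 1.
    """
    w_a = math.exp(-g_a / kT)
    w_b = math.exp(-g_b / kT)
    total = w_a + w_b
    return w_a / total, w_b / total


# Hypothetical energies in units of kT: B lies 2 kT below A.
p_a, p_b = two_state_equilibrium(g_a=2.0, g_b=0.0)
ratio_b_to_a = p_b / p_a  # equals exp(-(G_B - G_A)/kT)

print(f"p_A = {p_a:.3f}, p_B = {p_b:.3f}, p_B/p_A = {ratio_b_to_a:.3f}")
```

Only the energy difference matters here: shifting both energies by the same amount leaves the probabilities unchanged.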

In such a simple system, energy and probability are just two scales—linear and 
logarithmic—to measure the same parameter of distribution. In real chemical systems, absolute 
values of energy are rarely observable or calculable and in real complex stochastic systems, the 
event space is never complete. For all practical purposes, chemists are satisfied with their 
differences $\Delta G$, often even with the sign of $\Delta G$. 

If configurations A and B are formed from only two generators U and V, the 
transformation consists of bonding and unbonding of the generators: 



No other configurations are possible and nothing else can happen. 

On rich generator spaces, combinatorial configuration spaces, under the same regularity, 
can be indefinitely large. In this case configurations can be in equilibrium with an indefinite 
number of other configurations (Figure 5). 

The chemist tacitly accepts this, but on one condition: the time of equilibration is 
infinitely large. If the time of equilibration were short, explosive chemical complexity would be 
unmanageable. 

In fact, only a small fraction of all possible transformations runs with any measurable 
speed. The chemist possesses a large (but not exorbitantly large) collection of heuristics, which 
allow for limiting the alternatives, ideally, to one or two, or otherwise accepting the uncertainty, as in 
radiation damage of biopolymers. The chemist simply knows that some bonds, actually, most of 
them, will not change in a spontaneous movement toward equilibrium. Only the fastest 
transformations can be taken into account. 





Suppose we have two concurrent transformations: 


A + B → C + D 
A + B → F + G 

Since the rate of transformation AB → CD is proportional to the product of the 
probabilities (concentrations) of the configurations on the left, the fastest transformation will 
quickly exhaust A and B and further slow down AB → FG. The two transformations compete 
for a limited resource. Nevertheless, if both are reversible, the final result will be equilibrium. By 
skillfully using what is called thermodynamic (equilibrium) and kinetic (transformation speed) 
control, and, especially, catalysis, the chemist rides the professional bicycle that falls down when 
it stops. The living cell is no different: it must run its biochemical cycles in order to stay on the 
road. 



Figure 5. The tree of transformations in chemistry (stages: Initial, Transition, Final) 


The key to managing chemical complexity and staying away from combinatorial 
explosions is in the reaction rate, which for AB → CD is expressed as: 


$d[C]/dt = d[D]/dt = K_r [A][B]$, where $K_r$ is the rate constant. 
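
The competition of two transformations for the same reactants can be followed numerically. The sketch below is an illustration under simplifying assumptions: both channels are treated as irreversible (pure kinetic control), the rate constants and initial concentrations are hypothetical, and a plain Euler step is used for integration.

```python
def simulate_competition(k_fast, k_slow, a0=1.0, b0=1.0, dt=0.001, steps=20_000):
    """Euler integration of two irreversible channels sharing reactants A and B.

    A + B -> C + D with rate k_fast*[A][B]
    A + B -> F + G with rate k_slow*[A][B]
    Returns final ([A], [C], [F]); [B], [D], [G] follow by stoichiometry.
    """
    a, b, c, f = a0, b0, 0.0, 0.0
    for _ in range(steps):
        flux_fast = k_fast * a * b * dt
        flux_slow = k_slow * a * b * dt
        a -= flux_fast + flux_slow
        b -= flux_fast + flux_slow
        c += flux_fast
        f += flux_slow
    return a, c, f


a, c, f = simulate_competition(k_fast=10.0, k_slow=1.0)
print(f"[A] left = {a:.4f}, [C] = {c:.4f}, [F] = {f:.4f}, [C]/[F] = {c / f:.1f}")
```

In this irreversible toy case the products are split in the ratio of the rate constants; with reversible channels the same system would drift, much more slowly, toward the equilibrium distribution.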



To explain why rate constants vary and transformation takes some time, the chemist 
assumes that between A and B on one side of the transformation and C and D on the other side, 
all four being regular, there is an irregular (and therefore unstable, ephemeral, and of low 
probability/concentration) transition state ABCD*, which is neither AB nor CD. It is much less 
probable and, therefore, of higher energy than AB and CD, so that AB and CD are in equilibrium 
with ABCD*. The transition state, therefore, is the bottleneck of real transformation. 

It is easy to see that this conjecture introduces time into equilibrium thermodynamics 
where it does not belong. Justifications for this sleight of hand can be found in quantum physics. 

In Figure 6 transformation A → B ($G_A > G_B$) goes through transition state AB*. 
The reverse transformation B → A brings the system, sooner or later, to the equilibrium. 
The appearance of the curve is entirely fictional because it is not observable. Moreover, in most 
cases, AB* is not observable either: the entire curve is passed very fast. The bottleneck 
is caused by the unfavorable position of the equilibrium between A and AB*. 

In social systems, it is the very rarity of a “big” transformation, as the rarity of a 
formative event in Freudian psychology, that makes it irregular. 



Figure 6. Transition state AB* between stable states A and B 





The fact that the transition state is typically not observable follows from its instability; 
the instability (short life span, low concentration) follows from its high—as compared with 
stable initial and final states—energy; the high energy follows from its irregularity; irregularity is 
most often the result of a loose uncoupled bond or another internal stress, for example, because 
of geometrical deformation. The italicized words are used in similar meanings in social 
psychology (stress) and PT (deformation), not to mention material engineering. What follows 
from this causal chain is the notion of regularity and stability as the vehicle of analogy across 
long distances in knowledge representation. 

The sequence of stages, including hypothetical or sometimes observable (for example, by 
color or magnetic properties) transition states is called mechanism. 

Example: transformation A-B + C-D → A-C + B-D can go through two 
mechanisms: 

1. A-B → A* + B*; 
   A* + C-D → [A.C.D]* → A-C + D*; 
   D* + B* → D-B 

2. A-B → A* + B*; 
   C-D → C* + D*; A* + C* → A-C; 
   D* + B* → D-B 

In the above mechanisms, all configurations with an asterisk, i.e., with missing or extra 
bonds, are unstable. In [A.C.D]* the unusual (irregular) dotted line symbolizes irregular 
bonds. It can be compared with a love triangle, while A*, B*, etc., represent the singles eager 
to find a partner. Let us have in mind, however, that what is irregular in chemistry is perfectly 
regular in quantum chemistry. 








To summarize, in pattern kinetics, irregularity is the key to the probability of a certain 
direction of change. 


7. Equilibrium, kinetics, and catalysis in history 


As applied to history, the hypothesis is: if two or more alternatives are available, as in Figure 5, 
what can go through the least irregular transition state, i.e., with the lowest energy relative to 
the initial state, has the highest chance of actually happening. Of course, the word “probability” is 
used here metaphorically, as we use the words “probability of rain” talking about weather. It 
would be more appropriate to use the vague “chances.” 

In the long run, however, the outcome is driven by the difference in energy of the initial 
and the final state. Thus, in “The Peloponnesian War,” a book full of pattern trajectories traceable 
across a distance of 2500 years, Donald Kagan (2003) comes to the conclusion: “In a war of 
attrition the side that does the most damage must ultimately win” (p. 75). He obliquely suggests 
that if Athens had been more offensive against Sparta at the very beginning of the war, instead of 
taking a defensive position, the outcome could have been more favorable for it. In fact, a period of a 
more active military policy in the middle of the war brought Athens significant success, lost 
later. 

Similarly, Paul Kennedy believes that the outcome of WWII was predetermined by the 
economic and numerical superiority of the anti-Nazi allies who, I would rephrase, could inflict 
more damage. In the short run, the Nazis seemed unstoppable. But the history of the twentieth 
century is full of conflicts where the superior power could not win. 

The final state of a historical conflict, therefore, can be evaluated more or less 
objectively. However, quoting Kagan, “But reason rarely predominates when states and their 
people have gone to war, and objective calculations of comparative resources are rarely enough 
to predict the course of an extended conflict.” (p. 63). People are driven by “fear, honor, and 
interest.” Besides, as Kagan noted, true democracy can be a big “inconvenience” for a country at 
war (p.87). In the first American war of the twenty-first century, the unwillingness of the 
democratic societies to inflict damage on anybody, including themselves, was unprecedented. 

Turning to the most recent events, the Second Gulf War of 2003 against Saddam Hussein 
had an extremely low transition barrier: the well-oiled military machine could be started by a 
single order, and Iraq had no air defense. The transition to the post-war situation was swift and 
smooth. The final state, however, had opened a long-run process, with its own hills and valleys 
of energy landscape, where completely different forces began to act in the war of attrition. Only 
a historian, however, can offer the postmortem of the final state. While history is in the making, 
the final state is in the imagination of the leaders and the public for whom the transition state is 
of a more imminent priority. 

Strictly speaking, historical process has no a priori known final state, which is in full 
compliance with the view of Prigogine. Extrapolating this principle to the history of science itself, 
however, we may hope (exactly because the outcome is not known in advance) to have some 
better understanding of patterns of history, including the transition and final states, after 
developing some theoretical foundations and processing the Very Large Chronicle by the 
apparatus of analytical history. History of science is full of realizations of impossibilities. The 
way to such understanding lies through analytical history, which was stimulated by an ontological 
question. Similarly, we may ask whether a repertoire of initial, transition, and final states exists. 
The repertoire may turn out much simpler than we could expect because of the locality of change: 
historical patterns involve small configurations, as Grenander’s content involves a tiny fraction 
of the envelope. 

Chemical transformations take time because, due to the Maxwell-Boltzmann distribution 
of energies, only a small part of molecules have sufficient energy to cross the barrier of transition 
state (it is called activation energy). Anything that could lower the barrier of a particular 
transformation would enhance its rate. This is the essence of the catalytic effect, which can be quite 
dramatic. The enzyme catalase increases the reaction rate by many orders of magnitude, which reduces the 
reaction time from years to seconds. This reaction goes to the very end because the removal of 
the forming oxygen makes equilibrium impossible: peroxide → water + oxygen, or 
$2\,\mathrm{H_2O_2} \to 2\,\mathrm{H_2O} + \mathrm{O_2}$. 
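
The effect of lowering the barrier can be estimated with the Arrhenius factor exp(-Ea/RT). The sketch below assumes that catalyzed and uncatalyzed reactions share the same pre-exponential factor, and the two barrier heights are illustrative round numbers chosen for the example, not measured values for any particular enzyme.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def boltzmann_factor(activation_energy, temperature):
    """Approximate fraction of molecules energetic enough to cross the barrier."""
    return math.exp(-activation_energy / (R * temperature))


def rate_enhancement(barrier_plain, barrier_catalyzed, temperature=298.15):
    """Ratio of Arrhenius rate constants, assuming equal pre-exponential factors."""
    return math.exp((barrier_plain - barrier_catalyzed) / (R * temperature))


# ASSUMPTION: illustrative barrier heights in J/mol, not measured values.
PLAIN, CATALYZED = 75_000.0, 25_000.0

print(f"fraction over plain barrier:     {boltzmann_factor(PLAIN, 298.15):.2e}")
print(f"fraction over catalyzed barrier: {boltzmann_factor(CATALYZED, 298.15):.2e}")
print(f"rate enhancement:                {rate_enhancement(PLAIN, CATALYZED):.2e}")
```

Because the barrier enters through an exponent, a moderate decrease in activation energy translates into a change of the rate by many orders of magnitude.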





A common misconception about chemical catalysis is that it works as in Figure 1, very 
much like two hands of an assembly worker (demon), by taking two pieces, connecting them, 
and disengaging itself from the assembly. The devil of this picture is in the reversibility of 
time. The catalyst will enhance both direct and reverse chemical transformation, or, in terms of a 
match-maker, both marriage and divorce. The mechanism of chemical catalysis is beyond the 
scope of this paper (a quick view of kinetics, catalysis, and transition state can be found on the 
Web). If chemistry moves toward equilibrium, while biochemistry does not, it is because of the 
dissipative nature of life: the wheels of biochemical cycles spin in one direction, consuming free 
energy and ejecting heat. The same fully applies to civilization: micro-events could be 
reversible, as the Prohibition law of 1920 was, but the overall course is not, as long as dissipation lasts. 

Therefore, we can call catalyst any factor that lowers the transition barrier and inhibitor 
any factor that raises the barrier. Thus, the military attack of the USA on North Korea is currently 
strongly inhibited by the geography and resources of the Korean Peninsula, however attractive 
the final bliss of security looks in imagination. The catalyst is the fear of the consequences of a 
North Korean contribution to a nuclear attack on the USA. It is easy to imagine the high 
irregularity of the current transition state and the stress it imposes on the decision makers. 

There is another powerful factor that can lower the transition barrier: high temperature, 
i.e., the intensity of chaos. In times of confusion after a shock of an attack, natural disaster, or 
mass revolt, the high abstract temperature indiscriminately decreases all differences between the 
energy levels and flattens out the energy landscape: anything seems possible and the head of 
Louis XVI falls into the basket. 
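
The flattening of the landscape at high temperature can be shown with the same Boltzmann weighting. In the sketch below the three barrier heights and the temperatures are hypothetical numbers in arbitrary units, and treating the “chances” of historical alternatives as normalized weights exp(-barrier/T) is itself only a metaphorical assumption.

```python
import math


def chances(barriers, temperature):
    """Relative chances of alternatives, weighting each by exp(-barrier/temperature)."""
    weights = [math.exp(-b / temperature) for b in barriers]
    total = sum(weights)
    return [w / total for w in weights]


# ASSUMPTION: hypothetical barrier heights for three alternatives, arbitrary units.
BARRIERS = [1.0, 3.0, 6.0]

for t in (0.5, 2.0, 50.0):
    formatted = ", ".join(f"{p:.3f}" for p in chances(BARRIERS, t))
    print(f"T = {t:>4}: {formatted}")
```

At low temperature almost all the weight falls on the alternative with the lowest barrier; at high temperature the three alternatives become nearly equally likely.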


8. Illustration: Darius against Scyths 


Let us consider as an example a fragment of ancient—to avoid any contemporary political bias— 
history as told by Herodotus (1955). 

The story of the Persian march on the Scyths, around 500 BC, would make a great script 
for an action movie if there were a lot of action or at least a love story. In fact, there was mostly 
inaction: Darius did not manage to engage in a battle with the Scyths. And yet the story is full 
of suspense. Herodotus gives a vivid account of the struggle of the minds of the opponents that 
brought the conflict exactly to the point where it started. The story is remarkable for the 
psychological coherence and explanatory details that show the causation of events. It is both a 
history and a literary story. 

The pattern of the story—the expedition and its return without an accomplished goal— 
can be seen, probably, in the Vietnam War and some other conflicts of the last half of the 
twentieth century. 


Herodotus gives the lack of allies and any land to protect as the reasons why the nomadic 
Scyths initially refused to fight. When some of them finally decided to give battle, it was too 
late: Darius decided to go home. The reasons for his final decision are stated with high 
psychological accuracy. 

When Darius, despairing of the void that meets him in Europe, challenges the Scyths, they 
respond with a cryptic message allowing for a multitude of interpretations. The uncertainty of 
the situation brings Darius under stress. He is inclined to see the message as the sign of surrender, 
but Gobryas, his general, sees it as a threat. Next, the equally cryptic behavior of Scyths who 
abandon the battlefield to chase a hare, makes the stress unbearable and shifts the unstable 
transition state of cognitive dissonance toward the unfavorable interpretation of the message. 
This finally prompts exhausted Darius to turn around and go home. 

My purpose here is to trace the mechanism of the expedition as a sequence of stable 
configurations and transition states. 


There are at least three outcomes in a war: victory, defeat, and tie. The latter can ensue as 
a result of equal loss or a voluntary avoidance of the battle and return. In the initial simple 
configuration Darius—attack—Scyths we have no data to select the final outcome. We need to 
build up the complexity of the initial state to decide in which direction it could possibly move. 

Thus, Darius is not just a person, but a king, his location is Asia. He possesses a great 
empire and has nothing to lose but a bit of his prestige. The Scyths have nothing to lose because 
they are not tied to the land. This makes the conflict a kind of a cold war. 





At the beginning of the story Darius only contemplates the expedition. Herodotus 
supplies us with almost all available (but not always reliable) knowledge about Darius and the 
Persians, even such details as the thickness of their skulls as compared to those of the Egyptians. 
We can start, for example, with the expansion in Figure 7. 



Figure 7. Darius and the Scyths: the seed of an initial configuration 

By adding more details we can build an incomplete initial configuration as in Figure 8. It 
consists of two sparsely connected parts around Darius and the Scyths. A complete one would 
include all we know about the events, protagonists, and their physical, historical, and 
geographical backgrounds. For example, a separate subconfiguration would represent Ionians 
who double-crossed the Scyths. We do not need this, however, because most of the factors are 
either irrelevant or remain unchanged. 

None of the configurations, graphs, tables, and other representations has any claims for 
the truth. My purpose here, following the ideas of Ulf Grenander, is to show how this can be 
done in principle, so that a platform could exist for debates possibly leading to a relative 
consensus typical for natural sciences. As I believe, that type of consensus was in the ultimate 
vision of Pitirim Sorokin and Vilfredo Pareto in sociology. It is in the same way of naturalization 
that social psychology has managed to create a platform for discussion and consensus in the area of 
small groups with the theories of balance (Heider, 1958), cognitive dissonance (Festinger, 1962), 
etc., having a strong chemical flavor. Apparently, Benedict Spinoza was looking in the same 
direction in his Ethics. 

In the absence of quantification, the measurements can be reduced to ordered and 
partially ordered sets and approximate scales, called “naive” in Artificial Intelligence. As a 
curious example with historical flavor I would quote Analects of Confucius (WWW) where the 
Master builds a partially ordered set of ethical values: 


Tsze-kung said, 'What do you pronounce concerning the poor man who yet does not 
flatter, and the rich man who is not proud?' The Master replied, 'They will do; but they 
are not equal to him, who, though poor, is yet cheerful, and to him, who, though rich, 
loves the rules of propriety.' Book I, Chap. XV. 1. 

The Master said, 'They who know the truth are not equal to those who love it, and they 
who love it are not equal to those who delight in it.' Book 6. Chap. XVIII. 


In the same way, I measure the irregularity—otherwise, tension or stress—of the 
situation on the scale from one to four. 

The Table lists some states of the story. The tension is marked by 1 to 4 red asterisks. 

The moments of irregularity are those uncertain and short-lived transition states where, like in 
the chemical transition state [A.C.D]*, the rules of regularity are violated, for example, 
when a decision is being made or the protagonists encounter a puzzle that does not fit any 
rational framework. 

The focus of the tension, shown by a red circle, usually has two incompatible arrows 
converging on it (disputes) or two diverging (decisions) arrows of alternatives. Those are 
Prigogine’s moments when the baker again folds and rolls the dough of the process. In 
Antiquity, the baker was a god. Human decisions and actions, of course, may have zones of 
reversibility. 

Figures 9 and 10 present some stressed intermediate states and the final tension-free 
state of the last decision and a dubious happy end of the safe return. 

The intermittent stable and transition states are shown in Figure 11. The “mountain 
chain” or “roller-coaster” appearance is typical for stories, whether fictional or real. For 
comparison, Figure 12 portrays the roller-coaster of the French Revolution. 







Table: The events and tensions of Darius’ expedition 

No. | State (D: Darius, S: Scyths) | Tension No. | Tension / irregularity (Persians, Scyths) 
1   | Darius wants to attack Scyths |    | 
2   | Artabanus tries to dissuade D in vain. | 1  | ** 
3   | D sets out |    | 
4   | D passes two rivers |    | 
5   | D passes Ister. Dispute whether to leave the bridge | 2  | ** 
6   | Scythians confer with local people, looking for allies. S decide not to fight D in a battle. | 3  | ** 
7   | S lure D eastward, over Tanais (Don). |    | 
8   | S direct D into non-allied people, to provoke them. Send the wagons north. |    | 
9   | D reaches Oarus (Volga), sees no S, and turns back. | 4  | ** 
10  | S meet resistance of non-allied people. | 5  | ** 
11  | D sends a message to S (to Idanthyrsus). | 6  | *** 
12  | S send troops to Ionians at the bridge, others decide to fight. | 7  | *** 
13  | Braying of the asses bothers S | 8  | ** 
14  | S try to delay D’s departure and help him with food. |    | 
15  | A bird, a mouse, a frog, and five arrows | 9  | 
16  | S confer with Ionians, who agree. | 10 | *** 
17  | Almost a battle, if not for the hare. | 11 | *** 
18  | Gobryas: leave the weak, move to Ister |    | 
19  | S pursue D, going to the Ister but the armies miss each other |    | 
20  | The Ionians decide on the fate of D | 12 | ** 
21  | Ionians give false promise to S and destroy only a part of the bridge |    | 
22  | S believe the Ionians for the second time |    | 
23  | D finds the bridge broken | 13 | ** 
24  | An Egyptian with a loud voice saves D, calling, by his order, Histiaeus, who responds from the other side |    | 









Figure 8. The initial irregularities of Darius and Scyths configurations (red). 




















































































































































































Figure 9. Intermediate irregularities of Darius configuration. 


Figure 8 consists of two sparsely connected subconfigurations: that of Darius, State No. 2 in
the Table, and that of the Scyths, State No. 6.

Figure 9 presents States 15 and 17, separated in time, as the marks t = n and t = n+1 show.

Figure 10 portrays States 20 and 21, i.e., the process of decision-making by the Ionians. The
final state is still slightly strained by contradictory interests of power and independence.

Figure 11 corresponds to the final state of the Darius configuration.

Figures 12 and 13 illustrate the landscape of tension (i.e., relative energy) in a series of 
relatively stable and transitional configurations for the march of Darius and the French 
Revolution. 
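As a rough illustration of how such a landscape follows from the Table, the sketch below turns the asterisk counts into a list of tension levels, one per state. It is only a reading aid, not the procedure used to draw the figures. Two assumptions are made here and repeated in the code comments: the garbled mark for State 17 is read as four asterisks, and states with no surviving mark (including State 15) are treated as tension-free.

```python
# Tension (number of asterisks) per state, transcribed from the Table.
# Assumptions: State 17 is read as 4 asterisks (its mark is garbled in the
# source), and states with no surviving mark, including State 15, count as 0.
TENSION = {2: 2, 5: 2, 6: 2, 9: 2, 10: 2, 11: 3, 12: 3, 13: 2,
           16: 3, 17: 4, 20: 2, 23: 2}
N_STATES = 24  # the Table lists 24 states


def tension_profile(tension, n_states):
    """Return the tension level of every state, in order, with 0 for unmarked states."""
    return [tension.get(state, 0) for state in range(1, n_states + 1)]


profile = tension_profile(TENSION, N_STATES)

# Print a crude text version of the "roller-coaster".
for state, level in enumerate(profile, start=1):
    print(f"{state:2d} {'*' * level}")

# The state with the highest tension (first one, if there were ties).
peak_state = profile.index(max(profile)) + 1
print("peak at state", peak_state)
```

Under these assumptions the peak falls on State 17, which agrees with the caption of Figure 12, where point 17 separates the march from the return.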

It must be emphasized that the illustrations aim at building a platform for
consensus, not at the consensus itself, which should be left to specialists.






























































Figure 10. The transition (above) and final (below) states of Ionians. 




















































































































































Figure 11. The final state of Darius configuration. 



Figure 12. The tension landscape of the Darius’ expedition. Point 17 marks a border between 
two components: march and return. 





























































Figure 13. The tension profile of the French Revolution 


CONCLUSION 

As the historical sketch reveals, a mostly disconnected diaspora of natural scientists interested in
history has been forming for some time around the legacy of Rashevsky, Richardson, Tilly, and
others, starting from Democritus, Lucretius, and Leibniz. The analytical history of Roehner and
Syme, as well as Brecke’s study of patterns of military conflicts, have the potential of
becoming active centers of the “naturalization” of historical research.

A conundrum of the existing formal approaches to complex systems is the use of closed 
mathematical structures for representing open irreversible systems. Pattern Theory is suggested 
as another entry in the inventory of approaches to history. It could be a member of a future club 
of complementary concepts and methods. 

Pattern Theory, with its atomistic realism, flexibility, and preservation of semantics, is 
uniquely positioned for developing a general representation and modeling of Very Complex 
Open Systems, such as life, mind, and society. The first point of application of Pattern
Theory to history can be the transition state, characterized by its irregularity.

Analytical history and large historical databases can create the medium where the ideas 
of Pattern Theory can be tested. 






REFERENCES 

See also http://spirospero.net/complexity.htm 

ASA. American Sociological Association, Section for Mathematical Sociology.
http://www.sscnet.ucla.edu/soc/groups/mathsoc/mathsoc.htm

Ankersmit, F.R. 1994. History and Tropology: The Rise and Fall of Metaphor. Berkeley, Los
Angeles, London: University of California Press.

Bertalanffy, Ludwig von. 1968. General System Theory. New York: George Braziller.

Braudel, Fernand. 1992. The Mediterranean: And the Mediterranean World in the Age of Philip
II. New York: HarperCollins.

Brecke, Peter. 2002. Notes on Developing a Human Security/Insecurity Index.
http://www.inta.gatech.edu/peter/hsi.html

- 2001. The Long-Term Patterns of Violent Conflict in Different Regions of the World.
www.pcr.uu.se/Uppsala_paper_Brecke.pdf

- WWW. My Research Projects and Datasets. http://www.inta.gatech.edu/peter/

- and Long, William J. WWW. War and Reconciliation.
pro.harvard.edu/papers/005/005005LongWillia.pdf

Bremermann, Hans. 1974. Complexity of Automata, Brains, and Behavior, Physics and
Mathematics of the Nervous System. In: M. Conrad, W. Guttinger, and
M. Dal Cin (eds.), Biomathematics Lecture Notes, Vol. 4.
Heidelberg: Springer, pp. 304-331.
http://www.aeiveos.com/~bradbury/Authors/Computing/Bremermann-HJ/CoABaB.html

Bennett, Charles. 1973. Logical Reversibility of Computation. IBM Journal of Research and
Development, Vol. 17, pp. 525-532.
http://www.aeiveos.com/~bradbury/Authors/Computing/Bennett-CH/LRoC.html

Bourbaki, Nicolas. 1968. Elements of Mathematics: Theory of Sets. Boston: Addison-Wesley,
originally published by Hermann (Paris), 1968.

Collins, Randall. 1998. The Sociology of Philosophies: A Global Theory of Intellectual Change.
Cambridge, Mass.: Belknap Press.

Confucius. WWW. Analects. http://classics.mit.edu/Confucius/analects.html 

De Landa, Manuel. 1997. A thousand years of nonlinear history. New York: Zone Books.

Dawkins, R. 1989. The Selfish Gene. New York: Oxford University Press.

Dodge, Martin and Kitchin, Rob. 2001. Atlas of Cyberspace. Boston: Addison Wesley.
http://www.cybergeography.org/atlas/atlas.html

Edling, Christopher R. 2002. Mathematics in Sociology. Annual Review of Sociology, Aug 2002,
Vol. 28, pp. 197-220.

Ehresmann, A.C. and Vanbremeersch, J.P. 1987. Hierarchical Evolutive Systems:
A mathematical model for complex systems, Bull. of Math. Bio. 49 (1) (pp. 13-50).

- 1991. Un modèle pour des systèmes évolutifs avec mémoire, basé sur la théorie des
catégories, Revue Intern. de Systémique 5 (1), (pp. 5-25).

Einstein-Besso. 1972. Correspondance, edited by Speziali. Paris: Hermann, pp. 537-39.

Eilenberg, S. and MacLane, S. 1945. Trans. Am. Math. Soc., 58, 231.

Eyring, H. and Polanyi, M. 1931. Z. Phys. Chem. B 12, 279.

Festinger, L. 1962. A Theory of Cognitive Dissonance. Stanford: Stanford University Press.

Foucault, Michel. 1970/1966. The Order of Things: An Archaeology of the Human Sciences. New
York: Random House.

Franzen, Jonathan. 2001. The Corrections. New York: Farrar, Straus and Giroux, p. 66.

Fukuyama, Francis. 1992. The End of History and the Last Man. New York: Free Press.

Grenander, Ulf. 1976. Pattern Synthesis. Lectures in Pattern Theory, Vol. I. New York:
Springer-Verlag.

- 1978. Pattern Analysis. Lectures in Pattern Theory, Vol. II. New York: Springer-Verlag.

- 1981. Regular Structures. Lectures in Pattern Theory, Vol. III. New York: Springer-Verlag.

- 1993. General Pattern Theory. A Mathematical Study of Regular Structures. Oxford, New
York: Oxford University Press.

- 1995. Elements of Pattern Theory. Baltimore: Johns Hopkins University Press.

- 2003. Patterns of Thought. www.dam.brown.edu/ptg/REPORTS/mind.pdf


Gurevich, Yuri. 1995. Evolving Algebra 1993: Lipari Guide, in Specification and Validation
Methods, Ed. E. Börger, Oxford University Press, pp. 9-36. Also present on the Web.

Heider, Fritz. 1958. The Psychology of Interpersonal Relations. New York: Wiley.

Herodotus. 1955. The History of Herodotus; Thucydides, The History of the Peloponnesian
War. Chicago, London, Toronto: William Benton.

Jaffe, Charles, Farrelly, D., and Uzer, T. 2000. Transition State Theory without Time-Reversal
Symmetry: Chaotic Ionization of the Hydrogen Atom. Physical Review
Letters, 84, 610-613.

Kagan, Donald. 2003. The Peloponnesian War. New York: Viking.

Kennedy, Paul. 1987. The Rise and Fall of the Great Powers: Economic Change and Military
Conflict from 1500 to 2000. New York: Random House.

- 1993. Preparing for the Twenty-First Century. New York: Random House.

Russell, Charles J. WWW. Basic Reaction Kinetics.
http://www.cchem.berkeley.edu/~chem130a/sauer/outline/kinetics.html

Krempel, Lothar. 1. A Gallery of Social Structures.
http://www.mpi-fg-koeln.mpg.de/~lk/netvis.html

- 2. Internationale Arbeitsteilung und globale ökonomische Prozesse: Analysen des
Welthandels von Autos mit Handelsdaten.
http://www.mpi-fg-koeln.mpg.de/~lk/netvis/global/

Landauer, R. 1961. Irreversibility and heat generation in the computing process. IBM Journ.
Research and Developm. 5, 183-191.
http://www.aeiveos.com/~bradbury/Authors/Computing/Landauer-R/IaHGitCP.html

Lane, M. 1970. Introduction to Structuralism. New York: Basic Books.

Lotka, A.J. 1956/1925. Elements of Mathematical Biology. New York: Dover.

Long, William J. and Brecke, Peter. 2003. War and reconciliation: reason and emotion in
conflict resolution. Cambridge, Mass.: MIT Press.


Leibniz, G. W. 1973. Monadology. In Philosophical Writings, trans. Morris and Parkinson.
Everyman.

Mack, Gerhard. 1996. Gauge Theory of Things Alive: Universal Dynamics as a Tool in Parallel
Computing. Prog. Theor. Phys. (Suppl.) 122, 201.

- 1995. Gauge theory of things alive, Nucl. Phys. B42:923.

- 2001. Universal Dynamics, a Unified Theory of Complex Systems. Emergence, Life and
Death. Communications in Mathematical Physics, 219, No. 1, (141-178).

- WWW. Web links to other works: http://lienhard.desy.de/;
http://lienhard.desy.de/call.shtml?sy_1; http://lienhard.desy.de/call.shtml?sy_3.

Mac Lane, S. 1971. Categories for the Working Mathematician. Berlin: Springer-Verlag.

Moriyasu, K. 1983. An Elementary Primer for Gauge Theory. Singapore: World Scientific.

Nehaniv, Chrystopher and Dautenhahn, Kerstin. 1998-1. Semigroup Expansions for
Autobiographic Agents. Proceedings of the First Annual Symposium on Algebra,
Languages and Computation (University of Aizu, 30 October-1 November 1998), Japan.
Editors: T. Imaoka and C. Nehaniv, pp. 77-84.
http://homepages.feis.herts.ac.uk/~comqkd/alcnote2.ps

- 1998-2. Embodiments and Memories - Algebras of Time and History for
Autobiographic Agents. Proceedings of 14th European Meeting on Cybernetics and
Systems Research [Vienna, 1998], Editor: Robert Trappl, pages 651-656.
http://homepages.feis.herts.ac.uk/~nehaniv/EM6pp/EM6pp.html;
http://homepages.feis.herts.ac.uk/~comqkd/em6pp.ps

Nicolis, G. and Prigogine, I. 1989. Exploring Complexity. New York: W.H. Freeman.

Prigogine, Ilya and Stengers, Isabelle. 1984. Order out of Chaos. New York: Bantam.

Prigogine, Ilya. 1983. The Rediscovery of Time. A discourse prepared for the Isthmus Institute.
http://www.magna.com.au/~prfbrown/ilyatime.htm

Rashevsky, Nicolas. 1968. Looking at history through mathematics. Cambridge, Mass.: M.I.T.
Press.

- 1959. Mathematical biology of social behavior. Chicago: University of Chicago Press.


- 1960. Mathematical biophysics: Physico-Mathematical Foundations of Biology, Vols.
1, 2. New York: Dover.

Rosen, Robert. 1985-1. Anticipatory Systems: Philosophical, Mathematical and Methodological
Foundations. Oxford: Pergamon Press.

- 1985-2. Robert Rosen, Editor. Theoretical Biology and Complexity: Three Essays on
the Natural Philosophy of Complex Systems. Orlando, San Diego etc.: Academic Press.
[Contains: I.W. Richardson, The Dynamics and Energetics of Complex Real Systems;
A.H. Louie, Categorical System Theory; Robert Rosen, Organisms as Causal Systems
Which Are Not Mechanisms: An Essay into the Nature of Complexity.]

- 1991. Life Itself. New York: Columbia University Press.

- 2000. Essays on Life Itself. New York: Columbia University Press.

Richardson, Lewis Fry. 1960. Arms and Insecurity: A Mathematical Study of the Causes and
Origins of War. Edited by Nicolas Rashevsky and Ernesto Trucco. Pittsburgh and
Chicago: The Boxwood Press and Quadrangle Books.

- 1960. Statistics of Deadly Quarrels. Edited by Quincy Wright and C. C. Lienau.
Pittsburgh: Boxwood Press.

- 1993. Collected Papers of Lewis Fry Richardson. Edited by Oliver M. Ashford, et al.
New York: Cambridge University Press.

Roehner, Bertrand M. and Syme, Tony. 2002. Patterns and Repertoire in History. Cambridge,
MA, London, England: Harvard University Press.

Schank, Roger C. and Abelson, Robert P. 1995. Knowledge and memory: the real story. In
Robert S. Wyer, editor, Knowledge and memory: the real story, pages 1-85. Hillsdale,
New Jersey: Lawrence Erlbaum Associates.

Sorokin, P. A. 1937. Social and Cultural Dynamics. 4 vols. New York, Cincinnati:
American Book Company, 1937-41.

Stengers, I. and Prigogine, I. 1997. The End of Certainty: Time, Chaos, and the New Laws of
Nature. New York: Free Press.

Tarnopolsky, Yuri. 2003. Molecules and Thoughts: Pattern Complexity and Evolution
in Chemical Systems and the Mind. www.dam.brown.edu/ptg/REPORTS/MINDSCALE.pdf
or: http://spirospero.net/MINDSCALE.pdf


See also http://spirospero.net/complexity.htm

Tilly, Charles. 1981. As Sociology Meets History. New York: Academic Press.

- 1984. Big Structures, Large Processes, Huge Comparisons. New York: Russell Sage
Foundation.

Thom, Rene. 1975/1973. Structural Stability and Morphogenesis: An Outline
of a General Theory of Models. Reading, Massachusetts: Benjamin-Cummings
Publishing.

Vitanyi, Paul. Tolstoy's Mathematics in “War and Peace”.
http://www.illc.uva.nl/j50/contribs/vitanyi/

Volterra, Vito. 1931. Leçons sur la Théorie Mathématique de la Lutte pour la Vie. Paris:
Gauthier-Villars.

Wallerstein, Immanuel. 1998. Presidential Address, XIVth World Congress of Sociology,
Montreal, July 26, 1998. http://fbc.binghamton.edu/iwprad3.htm

War. WWW. On-line Papers and Briefs: Nonlinearity, Complexity, Combat & War.
http://www.cna.org/isaac/on-line-papers.htm




The Chemistry of Protolanguage 


TIKKI TIKKI TEMBO 

The Chemistry of Protolanguage 


Yuri Tarnopolsky 


[Cover figure: a branched configuration of the elements B, C, D, E, F (above) and its
linearization B-D-C-E-F (below)]


2004 


© 2004 Yuri Tarnopolsky 

yuri@spirospero.net 


Last revised: July 11, 2005 
















Keywords: protolanguage, language origin, language evolution, speech generation, 
chemistry, chemical nomenclature, linearization, Pattern Theory, Transition State Theory, 
complexity, Ulf Grenander, George Zipf, Noam Chomsky, Joseph Greenberg, Mark 
Baker, Manfred Eigen, Ilya Prigogine, Walter Ross Ashby, Rene Thom, George 
Hammond, Robert Rosen, axiom of closure. 

ABSTRACT 

Protolanguage (Derek Bickerton) in linguistics corresponds to an 
evolutionary stage preceding the grammaticalized language as we know it. 

It could be possible to reconstruct the principles of protolanguage by 
turning to most general principles of evolution in a larger picture, of which 
chemistry is a relevant part. Both linguistics and chemistry are discrete 
combinatorial systems. Considering the chemical origin of life, chemical 
analogies might offer some insight into the origin of mind, language, and 
society, all of which developed on the platform of life. The conceptual 
basis for discrete combinatorial systems, including chemistry and 
language, can be found in Pattern Theory (Ulf Grenander) where ideas, 
utterances, and molecules are configurations. To draw the parallel further, 
chemistry uses its own language of chemical nomenclature to represent 
non-linear molecular structures as linear strings of symbols. Chemistry 
pays particular attention to the intimate mechanisms of structural 
transformations. A tentative concept of the mechanism of protolanguage 
generation is suggested as kinetically controlled linearization of a typically 
non-linear observable configuration through a non-observable thought. 
Generation of linear expressions in protolanguage is viewed as a process 
of generalized chemistry, going from a typically non-linear initial state 
through a transition state toward the linear output, under the constraint of a 
maximized preservation of configuration topology. 





CONTENTS 

1. Introduction 4 

2. Preview of main ideas 8 

3. Chemistry and linguistics: sister sciences 13 

4. Noam Chomsky and Joseph Greenberg 16 

5. Chinese and Chemicalese 25 

6. Rene Thom and images of change 32 

7. Configurations, patterns, and Nean 40 

8. Some risky ideas about mathematics and life 49 

9. Chemolinguistry: a chimera 52 

10. Tikki Tikki Tembo: language as a form of life 64 

11. Zipfing the chimera 69 

12. A chemist and a chimp speak Nean 78 

13. Scenes from the cave life told in Nean 84 

14. Concluding remarks 98 

15. APPENDIX 

15.1 Example of Chemicalese 101 

15.2 Examples of real-life large configurations 102 

15.3 The chemical view of the world 103 

15.4. Program nean 110 

References 111 





The words and verses differ, each from each,
Compounded out of different elements...

Lucretius (De rerum natura, II)

The order and connection of ideas is the
same as the order and connection of things.

Spinoza (Ethica, II, VII)


1. INTRODUCTION 


Mark C. Baker in his book The Atoms of Language (Baker, 2001) drew a consistent
analogy of linguistics with chemistry. Within the Principles and Parameters framework, 
a language is similar to a chemical element in the sense that it is a combination of certain 
parameters. Baker acknowledged that most people would associate words with atoms of 
language, but he simply put this view aside as “correct—in one sense.” (Baker, 2001, p. 
51). He was, of course, right, pointing to the Periodic System as a metaphor for the 
combinatorial nature of language. Moreover, the book manifested the true chemical spirit: 
it was built around numerous observable linguistic examples, as any typical chemical 
monograph is built around hundreds of structures and their transformations. 

For a linguist with comparative interests, a large part of the fun of doing linguistic 
research is searching out all the fascinating, deep, and intricate differences in how 
languages work. Indeed, the only thing that gives a comparable thrill is discovering the 
deep and fascinating ways in which they are all the same. (Baker, 1995) 

The above quotation reveals to me a kindred soul: I could sign it if chemist were
substituted for linguist and molecules for languages. The analogy is not exact, because
chemists deal with concrete, individual, and observable objects, while parameters are
rather abstract structural invariances.

It would be unbecoming for a linguist to say “It’s Greek to me” about chemistry,
anyway, but there is some genuine feeling of kinship between both areas. Linguists evoke
chemistry as the science epitomizing not only complexity but also its successful conquest.
The kinship was prophesied long before the birth of chemistry and after the birth
of the famous Greek Democritus:


The words and verses differ, each from each, 

Compounded out of different elements— 

Not since few only, as common letters, run 
Through all the words, or no two words are made, 

One and the other, from all like elements, 

But since they all, as general rule, are not 
The same as all. Thus, too, in other things, 

Whilst many germs common to many things 
There are, yet they, combined among themselves, 

Can form new wholes to others quite unlike.

Thus fairly one may say that humankind, 

The grains, the gladsome trees, are all made up 
Of different atoms (Lucretius, 1958, Book II). 

Chemistry, on its part, has been using extensive linguistic parallels for nucleic
acids and proteins since the discovery of their relation. Moreover, much earlier, chemistry
developed its own tongue with a lexicon heavily borrowed from Greek and a refined
grammar with codified inflections and word order.

Mark Baker’s book was the last drop into the bucket of observations that I had
accumulated over a significant time. This paper is an attempt by a chemist to view words
as atoms of a chemistry.

I am a chemist without any linguistic credentials whatsoever, but with a lifelong
interest in languages. I am, to a variable degree, familiar with properties of such
languages as Russian (native), English (current), German (studied at school), French,
Hungarian, Japanese, Hebrew, and a few others. My very limited hands-on experience
with the non-Indo-European languages, as well as a better (but by no means perfect)
knowledge of English and Russian, both Indo-European yet diametrically opposite,
persuaded me that, with all the striking differences in their design, all languages perform
the same function with the same means. Neither the opulence of Bantu languages, with
their classifiers and suffixes, nor the intricately woven ribbons of the Na-Dene verbs
could shake my conviction. The function is a representation of a non-linear “source,”
whatever it is, and the means is an optimal linearization of the non-linear representation.
When I looked into the linguistic literature, I found plenty of support.

The vocal-auditory channel has some desirable features as a medium of 
communication: it has a high bandwidth, its intensity can be modulated to conceal 
the speaker or to cover large distances, and it does not require light, proximity, a 
face-to-face orientation, or tying up the hands. However it is essentially a serial 
interface, lacking the full two-dimensionality needed to convey graph or tree 
structures and typographical devices such as fonts, subscripts, and brackets. The 
basic tools of a coding scheme employing it are an inventory of distinguishable 
symbols and their concatenation. (Pinker and Bloom, 1990). 

For over twenty years I have been watching the development of Pattern Theory
(Grenander, 1976-2003), sometimes from a close distance, regarding it as a general
approach to complex systems consisting of atom-like elements and connecting bonds. It
became clear to me that this mathematical theory of everything nicely covered not only
molecules and languages but also every discrete combinatorial system we could come in
touch with, and did it with an unprecedented combination of generality and realism.

Furthermore, I have witnessed the entire genesis and evolution of the science of
complexity, starting from Prigogine (1984), who formulated the most fundamental
principles of complex natural systems such as life, mind, and society, and further toward
Artificial Life (Adami, 1998) where languages and molecules were of the same kin
already at the inception (Eigen, 1971-1979). “Natural” here is the opposite of “artificial,”
such as virtual reality where people can walk on the ceiling and turn into wolves right
before your eyes.

I am also familiar with the language of musical notation, a couple of
programming languages, and, due to my profession, with the curious language of
chemical nomenclature invented by organic chemistry to verbally communicate the
non-linear molecular structure.

Finally, the language of poetry—rarely spoken in everyday life—is my bonus 
pass to a gym where one can exercise linking distant meanings and close sounds. 

The enormous literature comprising computational, formal, traditional, and
historical linguistics, Artificial Intelligence, Artificial Life, mathematical structures, 
physics of open systems, chemistry, and details of Pattern Theory is probed here only 
highly selectively and superficially. The growing but still manageable bibliography on 
language evolution and computation has been nicely collected and presented at the 
website of University of Illinois at Urbana-Champaign (Language Origin, WWW). 

My intent is not to formulate a theory—this should be entrusted to 
professionals—but to offer a new (but organically grown!) spice for the boiling cauldron 
of linguistic ideas. Whoever likes the aroma can use it for meditation, inspiration, and, 
who knows, for some fun time after a Ph.D. thesis. I wish to share the widest and most
comprehensive view (an illusive but honorable goal) of the intellectual jungle where
linguistics and chemistry are of the same blood. In short, if it all boils, then down and up 
to Pattern Theory. 

I believe that my outsider status, as well as the claim for a larger picture, grants 
me the privilege of choosing my own far-from-academic style—which is just being 
natural. I cannot walk on the ceiling. But I have another uncommon gift: I see the world 
with the eyes of a chemist. 

I further refer to the following key figures in painting a large picture with
language in the landscape: Lucretius, Ulf Grenander, George Zipf, Manfred Eigen, Ilya 
Prigogine, Walter Ross Ashby, Rene Thom, George Hammond, all of them, except 
Lucretius and Zipf, natural scientists and mathematicians. I mention other profound 
linguistic thinkers in the main text. I am sure most of the ideas of this paper can be found 
in the literature and I apologize if I failed to find them. 

I widely use the WWW sources where it is possible. They may die out with time, 
but a peculiar life-like property of the Web is that the new ones will be cooked, could be 
searched for, and found, garnished with ads. For better or worse, money will never be out 
of the larger picture but I hope it will not fill up the entire canvas. 





2. PREVIEW OF MAIN IDEAS 


My initial thesis is: to compare language and chemistry we have to view them as natural 
phenomena within a larger picture. 

Language is embedded in human psychology and society, and is ultimately governed by 

the same principles as galaxies and mesons. (Hurford, 2003, p.38) 

We shall look at both from a more general view than either chemistry or 
linguistics and in both linguistic space and evolutionary time. 

From the evolutionary perspective, there must be a fundamental truth in the 
concept of protolanguage (Bickerton, 1990, 1995, Calvin and Bickerton, 2000), from 
which the full-blown languages evolved. Bickerton’s earlier vigorous and polemic book 
(Bickerton, 1981) was full of important large-picture ideas, some of which will be echoed 
here. The recent collective volume (Language Evolution, 2003) with Bickerton’s 
contribution, summarizes the current status of the problem and will be often referred to. 

Protolanguage is not just a playground for imagination. The use of protolanguage, 
within the framework of algorithmic AI, has been discussed and attempted for a 
simplified communication between a human and a computer or robot, see for example, 
Varshavskaya (2002) and Billard (2002), but the results were not too encouraging. 

If there is something truly universal for all languages, from pre-protolanguage to 
the modern street slang to the flashy lingo of The New Yorker art reviews, and from
English to Mohawk, it must be so not only in space but also in time. Here we will be
looking for a universal property of modern language, applicable also to 
protolanguage and its subsequent evolutionary record. By definition, it is something 
that cannot be found in either extant or extinct formal structures, but only diachronically: 
along the time axis. 

If we deal with a non-grammatical protolanguage we have to abandon all the
theories of fully developed language, together with all the contrived and artificial
examples and even the entire ontology of the past several millennia of culture. This is a 
big relief because it is difficult to find a non-trivial linguistic statement which has not 
been contested during the linguistic wars in which a chemist has no part. 

The form of dialogue, which had lost its appeal since Plato, Galileo, and Bishop 
Berkeley, was revived by Juan Uriagereka (1998) in his popularization of minimalist 
syntax. It inspired me to design the following introductory exchange, where vague words, 
like “typical” or “source,” are used to avoid obdurate head-on statements and 
meaningless terms such as meaning (why meaningless? to define meaning, you have first to define 
meaning). 

1. Q. What is typical for natural sciences? 

A. They deal with observable objects and processes. 

2. Q. What is typical for a natural process? 

A. A certain parameter, for example, energy, changes in a preferred direction,
unless an external influence prevents the natural course. The apple naturally falls 
down, if not caught by a human hand. 

3. Q. What is typical for molecules and utterances? 

A. Both are structures. Most generally, structure (not in the sense of 
“mathematical structure”) is a set of elements and a set of pairs, i.e., connections 
between some of them. Graph, especially colored and labeled, i.e., with values or 
markers at arcs and nodes, is a fair image of structure. Configuration of Pattern 
Theory is a better one. 





4. Q. Atoms interact. How do words interact? 

A. They form linear strings: utterances. Some strings stick together, which
reveals the affinity of the words to each other, and participate in verbal exchange. 
Other strings do not hold together and cannot be used. 

5. Q. What is the immediate source of the utterance? 

A. Thought. We do not know what it is except that it is a mental state or 
process. A thought, unlike an apple, is never shared and never observed—yet. 

(Very fortunately. But for how long?). 

6. Q. What is the source of the thought? 

A. Observable reality: thing, situation, object, process, relation, sensation, 
information, sign, phrase, text, image, cue, signal, question, remark, utterance— 
anything that could be shared or witnessed by at least two people. 

8. Q. What is the relation between a thought and its expression as an utterance? 

A. Since we have no means of observing a thought, we have to go to the
source of the thought outside the individual mind. The image of the source 
preserves topological relations between components of observable reality. The 
structure of thought is, hypothetically, not necessarily linear. Structures in PT 
(configurations, images, patterns) and chemistry are typically non-linear. 

9. Q. What happens when an utterance in protolanguage or language is 

generated from its source? 

A. Linearization of a typically non-linear configuration. In any case it must 
happen somewhere between the source and the expression like “I see Og take Ug 
meat” using Bickerton’s example (Calvin and Bickerton, 2000).





10. Q. What is protolanguage? What is full language? 

A. Protolanguage is a linearization of thought in which the binary 
connections in the source are represented by pairs of adjacent words, which may 
not be possible for all pairs in the source. 

Language is the same as protolanguage, but the connection can be expressed,
in addition to word order, by means of morphemes, regardless of the adjacency. 

11. Q. How does language emerge and evolve from protolanguage? 

A. By mutation and replication in populations of utterances communicated in
a social group. “You say potahto, I say potato.” Language is a form of life. You 
say nukelar, I say nuclear. 

12. Q. What is typical for a form of life? 

A. In addition to replication and mutation, there is the important property of 
homeostasis, i.e. the ability to restore peace after turbulence and to minimize a 
deviation from “the middle road,” often by taking another evolutionary pathway. 

13. Q. What do language and chemistry have in common, apart from being 
discrete combinatorial systems? 

A. First, there is a stage of the fleeting and non-observable thought in 
language generation (it cannot even be remembered if not put into words) and 
there is a fleeting and typically non-observable transition state in a chemical 
transformation. Second, chemistry has a distinct language of its own, intended to 
linearize nonlinear chemical images. 
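
Answer 10 above can be made concrete with a small sketch. It is only an illustration: the set of binary links assumed for the source of “I see Og take Ug meat” is hypothetical (the text does not specify it), and the names SOURCE_LINKS, adjacent_pairs, and coverage are invented for this example. The sketch checks which links of the source are expressed by pairs of adjacent words and which are lost in the linear utterance.

```python
# Hypothetical source configuration: binary links between "atoms" of the scene.
# These particular links are an assumption made for illustration only.
SOURCE_LINKS = {
    frozenset(("I", "see")),
    frozenset(("see", "take")),
    frozenset(("Og", "take")),
    frozenset(("take", "meat")),
    frozenset(("Ug", "meat")),
}


def adjacent_pairs(utterance):
    """Return the set of unordered pairs of neighbouring words."""
    words = utterance.split()
    return {frozenset(pair) for pair in zip(words, words[1:])}


def coverage(utterance, links=SOURCE_LINKS):
    """Split the source links into those expressed by adjacency and those lost."""
    pairs = adjacent_pairs(utterance)
    expressed = {link for link in links if link in pairs}
    return expressed, links - expressed


expressed, lost = coverage("I see Og take Ug meat")
print(len(expressed), "links expressed by adjacency;", len(lost), "lost")
```

Under these assumed links, some connections (for example, take and meat) cannot be shown by adjacency in this word order, which is the situation the answer describes as “not possible for all pairs in the source.”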

Next, some points of the dialogue will be expanded in a series of chapters, 
without going into too much detail, because what truly relates both chemistry and 
linguistics is the zillion devils in the details. 

Finally, the above dialogue will be illustrated by computer-aided examples of 
linearization at the level of a hypothetical protolanguage called Nean. The computer, 
however, will act in the dumbest role of a generator of random numbers and could be 
replaced by flipping coins. To attribute any algorithm to the genesis of a natural 
system is tantamount to designating a creator. 

My concluding thesis is that an almost mindless process of linking two atomic 
names together (Ug big) because the atoms are linked in the source, could be sufficient to 
launch protolanguage. The random and mindless process of natural selection in the 
populations of utterances could be sufficient to launch the evolution of full language with 
its rainforest exuberance of species. 





3. CHEMISTRY AND LINGUISTICS: SISTER SCIENCES 


It is no wonder that chemistry and linguistics could be jointly found at over 800,000 web 
pages (in 2003) because any university is a natural place for them to rub shoulders. It is 
much less common to meet both on the same page of a scientific text. Nevertheless, an 
acquaintance was struck about half a century ago when chemists compared strings of 
amino acids and nucleotides with lines of text. Today the Web search for DNA + 
language delivers almost 1 million sites. “DNA linguistics,” as it is called, has become a 
largely independent direction of research (Searls, 1992). 

The time around 1950 was a period of extensive planting of new ideas. The 
concept of Artificial Intelligence, including automatic translation, was formulated. The 
application of the idea of transition state to molecular structure by George Hammond 
completely revolutionized chemistry. It was also the time of the articulation of the 
science of complexity by Ilya Prigogine, the arrival of the “extravagant” and 
“controversial” ideas of George Zipf, and the advent of the formal linguistics of Noam 
Chomsky. (I believe that the incredibly fruitful time was the aftershock of WWII: the soil was 
fertilized with the ashes). 

Mark Baker’s book (Baker, 2002) is only one piece of evidence of the interest in chemistry 
on the part of linguists, although the most significant one. There are other examples of 
mutual curiosity. 

At the conference Language as an Analogy in the Natural Sciences held in 
Munich in 1997, the chemist Pierre Laszlo presented an essay Belaboring the Obvious: 
Chemistry as Sister Science to Linguistics. Laszlo (1997) emphasized binary mixing and 
combining as the way chemical experience was acquired throughout history. 

L. Vlasov compared metals, which are more common in nature, and the less common 
non-metals with consonants and vowels, expanding the following metaphor: 

Nature speaks to us in the language of chemical compounds. And each of these is a sort 
of combination of chemical "letters," or elements occurring on Earth. The number of 
such "words" exceeds three million. But there are only just over a hundred "letters" in the 
chemical "alphabet." (Vlasov, 1970, Story No.20). 

The following was addressed to students of chemistry: “Learning organic 
chemistry is like learning a new language, a language that is both verbal and pictorial” 
(Ege, 1989, p.2). So, don’t be afraid. Chemistry is fun. 

Chomsky (2002) used the comparison of the burgeoning descriptive linguistics 
with chemistry to express optimism in developing a compact theory after a sufficient 
body of knowledge has been accumulated. 

Chomsky noted that chemistry had achieved understanding of the nature of its 
invisible objects through the union with physics, which has not shaken the factual 
knowledge itself. It implies that a further advance in linguistics may depend on including 
it into a larger picture of the world drawn by natural sciences. 

Mark Willems entitled his doctoral thesis in graph-theoretical study of semantics 
Chemistry of Language (Willems, 1993). Following his lead, Liu (2002) took up The 
Chemistry of Chinese Language as the title of his thesis. Both dissertations expand the 
research in “knowledge graphs” initiated by Cornelis Hoede at the University of Twente, 
the Netherlands. His compact lecture, available on the Web, is an excellent introduction 
into graph theory, knowledge graphs, and the larger picture (Hoede, 2003). 

There are countless ways of representing knowledge, meaning, and logic of an 
expression. The vast literature on ways and uses of representing a sentence as a graph 
makes me think that there is simply no way to prefer one to another and more can be 
invented. It also tells me that we do not know what thought is but are afraid to admit it. 





The concept of chemical structure has been ingrained into chemical thinking for 
almost 150 years. Generically, it means the following (wording could slightly differ): 

Properties of a chemical compound depend on the properties of the constituting 
atoms and the way they are put together. 

Compare that with: 

The meaning of an utterance is some function of the meanings of parts of that 
utterance and the way they are put together (Kirby and Christiansen, 2003). 

Quoting the Nobel Lecture of one of the founders of modern chemistry (Pauling, 1954): 

In 1861 Butlerov, making use for the first time of the term "chemical structure", stated 
clearly that the properties of a compound are determined by its molecular structure, and 
reflect the way in which atoms are bonded to one another in the molecules of the 
compound. 

Probably, much more scattered testimonies of affinity could be found, but the last 
one is strong enough. 

I would add some personal observations. Most linguists and chemists deal with 
material evidence: speech and text are as completely observable as molecular structure, 
at least by the modern means of analysis. Theory in both areas, however, deals with 
structures beyond observation. Finally, the material evidence is enormous in size, but by 
no means infinite, whatever the linguists may say in their strange and pervasive obsession 
with infinity. See, for example, Studdert-Kennedy and Goldstein (2003). 




4. NOAM CHOMSKY AND JOSEPH GREENBERG 

Linguists sometimes speak the language of chemistry without even realizing it. But can 
we speak the language of linguistics? 

In Molière’s Le bourgeois gentilhomme, a play with linguistic connections, 
Monsieur Jourdain was surprised to learn that he had been speaking prose all his life. A 
shock, rather than surprise, may await a natural scientist entering the world of formal 
linguistics explaining how we speak, prose or poetry. 

What follows are some impressions of a skeptical ignoramus like myself, pumped 
up for the sake of performance but without any malice. I am just trying to prepare the soil 
of Pattern Theory for the seeds of both chemistry and linguistics. 

The Rhyme and Reason by Juan Uriagereka (1998) was intended as an 
encyclopedia of formal linguistics for an outsider like myself. It reads as a sometimes 
irritating but irresistible spoof of Dante’s Inferno. Any delusions about the language one 
has been mindlessly speaking since childhood are supposed to be cured by elephantine 
doses of proprietary purgative terminology, strangely duplicating the most common 
words of our daily usage. Here is an example I found in References (Uriagereka, 1998, 
p.636): 


Uriagereka, J. 1994. Government restrictions on Basque movements. In Syntactic theory 
and Basque syntax, edited by J.I.Hualde and J. Ortiz de Urbina. University of the Basque 





When (in 2003) I search the Web for all the words in the title, over 2600 links pop 
up. When I subtract from the search linguistics and Uriagereka, the remaining 2450 
links deal with political restrictions on Basque separatism in Spain. 

To me, an ignoramus, minimalism sounds like a maximalist version of Zen 
Buddhism with koans like “empty category is not empty.” The giant book’s layout 
leaves almost half of it just blank, making one wonder if this is a hint to solving the koan. 

For an outsider it is hard to chase away the prophetic vision of Hermann Hesse in 
his The Glass Bead Game, which, in my opinion, anticipated the postmodern world 
obsessed with repertoire and performance and indifferent to substance and spirit. A look 
at the original works of the formal school shows a state of permanent flux, debate, 
elimination, and invention, for which the Magister Ludi himself is not always responsible. 
Paraphrasing Goya, the sleep of context multiplies monsters. 

The Chomskian linguistics looks to a chemist like a Gothic bestiary of creatures, 
one more bizarre than another, living in a haunted castle where the zombies of Trace 
and Empty Category drag out their ghostly existence. This world is not for the faint of heart: 
as in a horror movie, as soon as you, Bound in Chains!, have beheaded the common 
Merge, the towering Supermerge assails you from behind. Furthermore, after you have 
somehow tackled the midsize Label, a peewee Sublabel crawls into your pants, and 
behind the disabused projection, the monster of Superprojection Raises on its hind legs. 
But wait, there is also the bloodcurdling Superraising, even before the end of Part One. 

Well, I get some understanding, if not support, from an outstanding linguist: 

We all speak at least one [language] , that one we acquired without a lick of conscious 
effort, and most non-linguist, in the unlikely event that they opened a copy of Linguistic 
Inquiry or Natural Language and Linguistic Theory only to find stuff every bit as hard 
going as genetics or quantum mechanics, would in many cases react by saying ‘What’s 
all this nonsense about? Why are they making such a fuss about something that’s 
perfectly simple and straightforward?’ Bickerton (2003, p.77). 





The growing area of language evolution and computation, however, carries a clear 
promise to create a new, lighter, funnier, but incomparably more complex and technically 
intricate post-Gothic world, shielded from a curious trespasser by large and unpublished 
computer codes where, by definition, nothing is either minimalist or made of atoms. 

And yet the minimalist Inferno left a definite trace in my chemically 
pre-complexificated mind: there must be some deep (ergo, simple) truth in the entire 
approach. 

The “naturalization” of atoms in chemistry and, later, physics, as well as genes in 
biology, was a result of processing numerical experimental data. It is the absence of 
numbers that makes the generative linguistics a nightmare for the natural scientist. 

Nevertheless, knowing well that mathematics is not only about numbers, and 
responding as a chemist to the teasing call of my “sister science,” I see more than just 
vague parallels between formal linguistics and chemistry. I have a feeling that Noam 
Chomsky saw and tried to solve a really monumental scientific problem much ahead of 
time. The right time, probably, has not yet arrived: the fifty-year-old artificial intelligence, 
while beating the grandmasters at chess, still generates ridiculous translations. I feel 
that Chomsky’s appeal to chemical experience is justified. 

It is true that the chemists went through a similar stage. The chemical concept of 
atom, as speculative as genes before molecular biology, had been introduced by John 
Dalton around 1808, but it took exactly one hundred years before Rutherford could put 
some physical flesh on the surreal invention of the mind. After that, about fifty years of 
theoretical maturation followed. For the remaining half century, chemists have quietly 
immersed themselves in applications and their monetary rewards, now even sending ripples 
through the stock market with the dotcomoid nanotech. 

There is another area, more successful than AI, with a similar childhood: 
chemistry-based molecular biology. Formal geneticists were able to map the distribution 
of genes in chromosomes at the dawn of the twentieth century without the slightest idea of 
what either a gene or a chromosome actually was, simply by comparing the observable 
features of progeny with those of the parents and counting some numbers. 
Chomskian linguistics is strikingly similar to the pre-DNA formal genetics (now called 
transmission genetics), in which an enthusiast can even find parallels between Move and 
Merge of linguistics and the crossing over and linkage of transmission genetics. The 
dramatic (and probably lethal) difference is that formal linguistics is based on an idealized 
standard of language, which we (not having blood royalty in America) can never hear around us. 
Formal genetics, on the contrary, was based on the study of observable deviations 
from the idealized standard biological type. This is how the tiny fruit fly worked for 
genetics when the frequencies of mutations were counted. 

After applying Ockham’s razor, with the barbarism of an ignoramus, to the 
dense growth of formal linguistics, I (together with some linguists) see the main problem 
that formal linguistics deals with as the linearization of the non-linear, see Figure 4.1. 

This is like extruding the globe of dough through the spaghetti-making machine, for 
which you need to apply some effort. 

There is the unobservable “message” (my copout, to avoid terms like meaning, content, 
and knowledge) and there is its strictly linear verbal expression. The message is not 
necessarily linear but the expression is. How does the mind do it? For example, how 
could Shakespeare put on paper his mental images of the story of Hamlet, with its 
intricate relations, moves, and physical collisions? 

The simplest part of the problem is to portray the development of the story in time. 
This is done regardless of the language, by arranging consecutively the descriptions of 
events. The chain consists of phrases. For example, a primitive Hamletesque story 
could go like this: 

Enter Ug and Og. Og takes Ug’s meat. Ug ponders whether to kill Og. Ug 
kills Og. Ug dies of spoiled meat. 



Figure 4.1. Linearizing a tree 





The drama of the story is in the unexpected consequences. The linguistic drama 
starts each time when a single link of the diachronic chain must be constructed. There 
could be several alternatives and which one will be realized creates the suspense in the 
mind of the cave Shakespeare. 

Within the formal framework, it is useless to ask, as a chemist would, what real 
process stands behind the schema, what its driving force is, how long it would take, how 
it starts, proceeds, and ends, as well as what the word and its boundaries are, and how we 
form the initial array of words. Nevertheless, the core of formal linguistics has the power 
of a well-formulated question, which is half the solution. 

It is hard to miss the similarity of the problem of linearization to the problem of 
projection in our perception of visual images: the complex 3D object turns into a flat 
image on the retina. For humans who, like greyhounds, get most of their information 
from vision, language seems to be an extension of vision in the sense that the same kind 
of topological many-to-one mapping has to be performed. Language can be called a 
collective vision. (What is TV? Corporate vision?). 

I absolutely do not intend to criticize the formal grammar. Atoms were criticized, 
too, and, for that matter, what sound idea was not? Just the opposite, I would like to 
formulate the points of the formal theory that not only make sense but also appeal to a 
chemist who has survived Superprojection and Sublabel. 

1. Binarity. The basic relation between words/atoms is binary in X-bar and 
chemical bond. Not only is there an attraction between some words, but there is also a 
bond between groups of words, which tend to stay within the group, like a swarm of 
midges (subjacency). 

2. Linearity. The output of a certain process is linear. This means that a certain 
typically non-linear structure undergoes a structural transformation, as in a chemical 
reaction. 

3. Optimality. The natural direction of the process is toward a minimization of a 
certain parameter, as in a chemical reaction going toward the state of equilibrium. This 
point is not exactly a part of the formal framework but it snugly fits the picture and is 
embraced by Chomsky. 





On this positive note, let us now take a peek into the world of Joseph Greenberg 
who was also attacked and vilified in his time and for the same reason as Noam Chomsky 
(and may I add Charles Darwin?): his conclusions about evolution of languages were beyond a 
direct proof. After the sulfurous inferno, however, the world of Joseph Greenberg brings 
the revitalizing smell of the rhubarb pie to the nostrils of the natural scientist who has just 
been zenned by Ug, sorry, UG (Universal Grammar). 

In the Semitic languages the verb tenses are marked by varying vowels, usually 
around three constant letters of the root, as in Hebrew (read Hebrew characters right to 
left): 

    אני כותב      ani kotev      I write 
    אני כתב       ani katav      I wrote 
    אני יכתב      ani iktov      I will write 

The root of the verb is ktv, כתב. The additional ambivalent letters י and ו are 
sufficient to distinguish between tenses in writing, without explicitly indicating how the 
words are pronounced, which could be done by diacritic signs. The signs, absent in the 
examples, are written in Hebrew and Arabic only when absolutely necessary. The last 
vowel of the verb between ת and ב is not hinted in any way. The heard but invisible 
vowels, therefore, define here the tense, which in writing is marked by adding some 
visible but not heard letters from a limited set. 

Figure 4.2 presents the top left-hand corner of Table 1 from Greenberg (1990, 
p.368). The table contains the frequencies of binary occurrences of the first and second 
consonants of three-letter roots of 3775 Arabic verbs. The work was stimulated by the 
previously well-known fact that Semitic roots do not typically contain identical first and 
second consonants, while no such restriction exists regarding the second and third ones. 
Greenberg presents two more tables with correlations of II-III and I-III consonants, which 
exhausts the matter. 

The numbers in Figure 4.2 are quantitative measures of occurrences of initial 
pairs of root consonants and they can be converted into probabilities, although the list of 
all verbs is not a natural system for a natural scientist. It is like a list of all animals in a 
forest without their actual numbers. Nevertheless, some conclusions can be drawn from 
the frequency of a combination of a long neck with long legs or hooves with horns. 

What the Table shows is a Shakespearean play of affinity and animosity between 
consonants. Two identical initial consonants categorically refuse to stand side by side, 
as the arrow along the diagonal zeros indicates, but there are also definite areas of affinity, 
as around consonant n (No. 12). It could be seen from another of Greenberg’s tables that 
the last two consonants do not mind being twins. 



Figure 4.2 A corner of one of Greenberg’s tables 

Analysis of this kind was one of the first exercises of the newly mastered power 
of computers (Shannon, 1948). Shannon measured the frequencies of letter and word 
pairs and synthesized artificial texts. Greenberg, however, did his work entirely by hand, 
as, by the way, Zipf did his. The frequency matrices of the type explored by Greenberg 
and Shannon are linguistic fingerprints. Like DNA analysis, they can be used for 
recognizing kinship between natural languages as well as styles of individual writers. 
Among other applications of frequency matrices are the fingerprints of musical styles. 
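
A frequency table of the kind Greenberg compiled by hand can be sketched in a few lines. This is a minimal illustration, not Greenberg's procedure or data: the roots below are a made-up toy list in Latin transliteration, and the names pair_table and identical_pairs are invented for the example. The sketch counts how often a consonant in one root position occurs together with a consonant in another position, and then counts the "twins."

```python
from collections import Counter


def pair_table(roots, i=0, j=1):
    """Count co-occurrences of the consonant in position i with the one in position j."""
    return Counter((root[i], root[j]) for root in roots)


def identical_pairs(table):
    """Total count of pairs whose two consonants are the same."""
    return sum(n for (a, b), n in table.items() if a == b)


# Toy, made-up triliteral roots (an assumption for illustration, not Greenberg's list).
toy_roots = ["ktb", "ktm", "drs", "qtl", "mdd", "hll", "drb", "ktl"]

first_second = pair_table(toy_roots, 0, 1)   # positions I-II
second_third = pair_table(toy_roots, 1, 2)   # positions II-III

print(first_second[("k", "t")])
print(identical_pairs(first_second), identical_pairs(second_third))
```

In this toy list no root repeats its first consonant in second position, while two roots have identical second and third consonants, which mimics the asymmetry described above.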

As a non-linguist, I will use the single old paper of Joseph Greenberg, published 
in 1950, the remarkable time with all the planets in the most auspicious layout for great 
ideas and the computers in the prenatal fetal position, to illustrate a principle important 
for both chemistry and linguistics. Scores of modern works are running off the computer 
mill today. 

Shannon’s game of generating texts from the probabilities of letter and word pairs 
still fascinates the public and specialists. The reader who copies the next paragraph and 
runs it through Shannonizer (Senyak, WWW) will easily see why. 
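
Shannon's game with letter pairs can be imitated with a short sketch. It is not the code of the Shannonizer, only an assumed minimal version: the function names, the sample sentence, and the fixed random seed are choices made for this illustration. Each next letter is drawn at random from the letters that followed the current one in the sample, so frequent pairs are reproduced more often.

```python
import random
from collections import defaultdict


def letter_pair_model(text):
    """Map each character to the list of characters that follow it in the text."""
    followers = defaultdict(list)
    for a, b in zip(text, text[1:]):
        followers[a].append(b)
    return followers


def generate(followers, start, length, rng):
    """Build a string by repeatedly drawing a follower of the last character."""
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # dead end: this character never had a successor
            break
        out.append(rng.choice(options))
    return "".join(out)


sample = "shannon's game of generating texts from the probabilities of letter pairs"
model = letter_pair_model(sample)
rng = random.Random(0)  # fixed seed so that the run is repeatable
fake = generate(model, "s", 40, rng)
print(fake)
```

Every pair of neighbouring letters in the output occurs somewhere in the sample, although the output as a whole is meaningless.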

Greenberg’s tables are something a chemist can fully identify with: the numbers 
characterise the strength of the bond between morphological atoms, or, metaphorically 
speaking, of their attraction to each other. In everyday parlance this is what we mean by 
chemistry—good or bad—of human relations. This is also what we mean by chemistry 
proper, in which only aggregates of atoms with good attraction are stable and those with 
repulsion or indifference are ephemeral or non-existent. 

High school chemistry is almost exclusively the “good” kind. Only when a student 
goes to college does he or she discover that modern chemistry started with the recognition 
and intense study of structures with bad internal chemistry. They are short-lived, rarely 
observable, capricious, unruly, and cannot be kept in a jar on the shelf, but chemical 
transformations go through such ghostly chimeras and cannot be explained without them. 

This may be an entirely poetic vision, but I associate the transition states, as the 
chimeras are called, with fleeting thoughts. Indeed, what is captured and fixed into a 
stable grammatical form as a generic “thought,” like “All people are born equal,” is only 
the stable result of the thought process. I suspect that thoughts are unobservable for the 
same reason most chemical transition states are: they are, as the old chemists used to say, 
in statu nascendi, in the moment of birth. 





If thought is a transition state and the verbal output is the final state, the 
unavoidable question is what the initial state of thought is. Questions of this kind, 
comprising the entire triad source-thought-language, have been generating the 
Himalayas of literature through millennia. I would prefer to leave the majestic 
background behind the merciful fog and present the larger picture as a few snapshots 
taken through a chemical lens. As language is a linearization of an image and image is a projection of 
an object, my notes are only a projection of both into the mind of a chemist, extruded as a text output, not 
without the help of a computer. 

The main para-linguistic problem that arises in connection with matrices and 
tables is how to send the entire Greenberg table over the telephone. The same problem exists 
in chemistry: how to send a structure over the telephone? Without the Internet, of course. 





5. CHINESE AND CHEMICALESE 


I believe that an organic chemist thinks not in words but in structures. They form 
the sustaining environment for the chemist, like the animals for the hunter or the plants 
for the gatherer on the verge of a protolanguage. I definitely see structural formulas or 
their fragments when I think about familiar compounds. They are the source of my 
thoughts about them and they can be shared. 

Some chemical thoughts are visual images of relatively simple structures, like 
benzene, while others are more complex but seen in all details, especially if the chemist 
works with them daily. Chemists can mentally operate the structures, similarly to the 
blindfolded chess player who can remember and operate positions on the board. For 
chemists, the relation between the object and its representation is exceptionally 
straightforward, iconic (Peirce, 1992). 

The development of a separate language in chemistry by the end of the nineteenth 
century, when the first comprehensive handbook appeared, was driven not so much by 
the need to communicate with other chemists, which could be better done on paper or 
blackboard, but, as I believe, by the need to store and search the stressfully 
mushrooming literature. A similar problem arose millennia ago in China. 

The Chinese characters (Hanzi in China and Kanji in Japan) consist of standard 
elements arranged in a certain order. They come from ancient pictograms, later combined 
into ideograms and supplemented with phonetic elements. 



The problem with pictograms and ideograms is that there is no preferred or obvious way 
to order a very large number of them. In China the stress that such a situation inflicted 
on the business of making dictionaries was resolved around 140 AD by using components 
called radicals (bushou) as “quasi-alphabetic” and partially semantic markers, while 
arranging them in order of increasing number of brush strokes (a combination of right and 
down strokes is considered a single stroke). Figure 5.1 (from Zhongwen, WWW) shows the 
Chinese bushou counterpart of the alphabet. The white-on-black characters are Chinese 
numerals from 1 to 17 signifying the number of strokes. The modern system of 214 bushou 
radicals was introduced in the seventeenth century. The Japanese writing system uses the 
Chinese characters, limited in number by the script reform of 1946 to 1850 characters (now 
1942), but is significantly different as a whole because of a separate indigenous phonetic 
system, not to mention the grammar. 

Figure 5.1 Chinese bushou 


Figure 5.2 shows a small part of all ideograms with the radical 人, 
ren, person (top left-hand corner), which is also used as a whole character. As a radical, 
it takes different forms depending on left or top position. The arrows point to: 1. Feelings 
(two + person). 2. Partner (person + fire, i.e., at the same fire). 3. Position (person + 
standing). 

The chemical nomenclature, amazingly, uses the same principles of dividing the 
structures into classes by semantic features, and arranging them within the class by their 
components and size (IUPAC, 1993). Examples are the radical “person” in Chinese and 
the class “heterocycle” in chemistry, i.e., cycles built with participation of atoms other 
than carbon. If there is more than one cycle, the compounds are ordered in search systems 
according to the number of cycles, their size, number of non-carbon atoms, and their 
types, which is all much more complex than the Chinese characters. 


Figure 5.2. Some Chinese characters with the radical ren. Inverted characters indicate the 
number of additional strokes. For an explanation of the arrows, see text. 

Like Hanzi, chemistry uses numerous abbreviations and shortcuts. Now and then, both 
Chinese characters and nomenclature are ambiguous or just difficult for a novice. For a 
curious example, see Appendix, 1. 

The chemical notation allows multiple versions of portraying the same structure, which can 
be transformed into each other by simple operations, such as permutation, flipping, 
stretching, rotation, etc. What remains invariant is the topology of the molecule. It can be 
represented by a molecular matrix, for example, in the case of formaldehyde with indexed 
hydrogen atoms, Figure 5.3. 

          H1   H2   C    O 
     H1   0    0    1    0 
     H2   0    0    1    0 
     C    1    1    0    2 
     O    0    0    2    0 

Figure 5.3 Formaldehyde and its molecular matrix 

Since the hydrogen atoms in formaldehyde are indistinguishable, the indexes are redundant, 
which is generally not the case, for example, in methanol with two types of hydrogen atoms, 
Figure 5.4. 



          H    H    H    H'   C    O 
     H    0    0    0    0    1    0 
     H    0    0    0    0    1    0 
     H    0    0    0    0    1    0 
     H'   0    0    0    0    0    1 
     C    1    1    1    0    0    1 
     O    0    0    0    1    1    0 

Figure 5.4 Methanol and its matrix 


The nonlinear pictographic and ideographic character of Chinese and chemical nomenclature 
separates the sound from the sign. Both systems of language make the message visually 
understandable, respectively, for all chemists regardless of language and all literate 
Chinese speakers regardless of very significant dialect differences. The 
spoken foreign Chemicalese, however, can be incomprehensible even for a chemist. Thus, 
in Russian, “benzin” means gasoline, while benzene is called “benzol” (from German). 

The musical notation can also be viewed as a quasi-hieroglyphic system where the 
signs represent the sound and its duration by separate means. But the notes clearly 
display a property that is much less pronounced in other languages: music, like natural 
speech in real circumstances, is mostly continuous and it consists of a hierarchy of 
overlapping segments, including very large ones (unless it is minimalist). For music as 
language, see Jackendoff (1987, Chapter 11). 

The matrix representation of chemical structure is not something that could be 
found in a textbook: the chemists do not need it, unless they are developing chemical 
software. The following is yet another representation of the structure of formaldehyde, I 
believe, not to be found anywhere, although the principle is known in programming as 
sparse matrix. 


C-H1 , C-H2 , C=O 


This representation lists the pairs of connected atoms in no particular order and 
contains the same information as structural and matrix representations. All three can be 
reconstructed from each other. 
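
The claim that the matrix and the list can be reconstructed from each other can be checked with a short sketch. The code below is an illustration under stated assumptions, not chemical software: the function names are invented, and a bond is written as a triple (atom, atom, bond order), with 2 standing for the double bond of formaldehyde.

```python
ATOMS = ["H1", "H2", "C", "O"]

# Molecular matrix of formaldehyde: entry = bond order between two atoms.
MATRIX = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 2],
    [0, 0, 2, 0],
]


def matrix_to_bonds(atoms, matrix):
    """List every bonded pair once, reading only the upper triangle."""
    bonds = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if matrix[i][j]:
                bonds.append((atoms[i], atoms[j], matrix[i][j]))
    return bonds


def bonds_to_matrix(atoms, bonds):
    """Rebuild the symmetric matrix from the list of bonded pairs."""
    index = {atom: k for k, atom in enumerate(atoms)}
    matrix = [[0] * len(atoms) for _ in atoms]
    for a, b, order in bonds:
        matrix[index[a]][index[b]] = order
        matrix[index[b]][index[a]] = order
    return matrix


bonds = matrix_to_bonds(ATOMS, MATRIX)
print(bonds)
print(bonds_to_matrix(ATOMS, bonds) == MATRIX)
```

The list keeps only the non-zero entries, which is why it is shorter than the matrix while carrying the same information.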

The list (sparse matrix) for methanol is: 3 (C-H), C-O, O-H'. 

What neither the matrix nor the list have is iconicity (Peirce, 1992 ), a similarity 
to the molecule itself. Note that similarity is a cardinal notion in Pattern Theory and can 
be formulated in exact terms. What the list has, however, is linearity and spontaneity. The 




30 


latter means a relatively high entropy and low effort resulting from a minimum of rules
required to assemble the list. The list of binary connections is, paradoxically, a random
way to represent order. From a sufficiently long random list with repetitions and reversals
of the doublets (C—H1 <-> H1—C) a complete representation can be easily derived
by eliminating redundancy. Thus, a list like this:

C—H1 , H1—C , C—H2 , C=O , H1—C , O=C , C—H1 , C=O !!!

resembles a fresh, on-the-spot account that a witness of a horrible accident gives to a policeman.
The agitated witness is repeating and varying the same phrases while the policeman is
jotting some concise notes down. The excitement of the witness creates an internal noise
and repetition is a patented way to get the information through a noisy channel.
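
What the policeman does with the witness's account can be sketched as follows. This is a minimal illustration, assuming that a plain hyphen stands in for the single-bond dash and that the atom labels are H1 and H2; the function names are hypothetical.

```python
def canonical(bond):
    """Parse 'C-H1' or 'O=C' into (unordered pair of atoms, bond symbol)."""
    for symbol in ("=", "-"):
        if symbol in bond:
            left, right = bond.split(symbol)
            return frozenset((left.strip(), right.strip())), symbol
    raise ValueError(f"no bond symbol in {bond!r}")

def eliminate_redundancy(noisy):
    """Keep the first occurrence of every doublet, ignoring its direction."""
    seen = set()
    unique = []
    for bond in noisy:
        key = canonical(bond)
        if key not in seen:
            seen.add(key)
            unique.append(bond)
    return unique

# The agitated witness: repetitions and reversals of the same three bonds.
NOISY = ["C-H1", "H1-C", "C-H2", "C=O", "H1-C", "O=C", "C-H1", "C=O"]
CLEAN = eliminate_redundancy(NOISY)
```

Only a comparison of one doublet with those already noted is needed; no knowledge of the whole molecule is required at any step.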

Our cave-dwelling ancestors are unreasonably portrayed in the movies as 
unkempt and dirty, which can hardly be seen even among monkeys, but we definitely 
could not expect from them the eloquence of Demosthenes. The eloquence of a survivor 
of a plane crash is more probable: the crash of the good old animal life.

The most important thing for us is that a highly randomized linear sequence of 
pairs is sufficient to code a non-linear chemical formula. To preserve the binary relations 
in whatsoever order is all that is needed. To think in Pavlovian binary relations 
“ringing”—“food” is something any mammal is good at. 

[Figure: structural formula of the substance, a six-carbon chain carrying methyl and hydroxyl groups]

The language of chemical nomenclature linearizes the formula of the substance on the left by
deriving its name from the longest six-carbon chain, here hexan, C6. The atoms of the chain are
numbered and the groups that substitute for atoms of hydrogen are listed by their “morphemes”
methyl and ol in a prescribed order along with their positions. The components of carbon
chains, such as methyl, are called radicals, while -ol is an example of a function (hydroxyl).
It is more complicated if the substance contains cycles and non-carbon atoms in long chains.
Chemical nomenclature widely uses parentheses and brackets as syntactic means of grouping,
nesting, and long-range connections.

Both a written Chinese sentence and a chemical name are perfectly linear in
appearance. There is as little similarity between a structure and its chemical name as
between a live person and the sound ren that signifies it, but the structure can be
reconstructed from the name. The chemical name is a linearized molecular matrix.

The Chemicalese does it by using the complicated grammar of chemical nomenclature. 
This is the simplest universal grammar, truly minimalist, and seen in all its nakedness. 
Still, it does not say anything about protolanguage. 

In spite of my declaration that chemists think in structures, the chemical thought
is not observable, either. A chemist may associate with the word “formaldehyde” its
gross formula CH2O, structural formula, the image of the bottle, or the smell, depending
on context and circumstances. In a discourse, ambiguity, never completely suppressed, is
minimal or easily resolvable by a question.

The chemist has all the reasons to believe that the chemical formula is not just a
symbolic representation of the molecule but its model or a projection of a model that
preserves important properties of the original, namely, its topology and in some cases its
metrics. What the chemist keeps in his mind, and the formal linguist brushes off, are
the physical properties of atoms and “atoms” and bonds between them.

To summarize, the function of language, so clearly demonstrated by chemical 
nomenclature, is first and foremost linearization. It turns the source (configuration, event, 
situation) into a string. In the rather rare chemical case, the source, most probably (but 
you never know), differs very little from its invisible mental representation. 

The idea, natural for a chemist, that the haphazard list of bonds relates to the
rules of chemical nomenclature as protolanguage relates to language is the central
proposition of this paper. There have been currents in linguistics, as well as in the
collective painting of a larger picture, flowing in the same direction and branching into a
wide delta.



Atoms are real, so are Og, Ug, cave, and a piece of meat, but what is a verb? Can
we put a chemical finger on give or take?



6. RENE THOM AND IMAGES OF CHANGE 


The approach to the language through the main entrance from the Broadway of reality
can count on a solid and eloquent literature, including Charles Sanders Peirce (1839-1914; Peirce,
1992), Lakoff and Johnson (1980), Johnson (1987), Harnad (1990), and Cangelosi et al. in
Cangelosi and Parisi (2002). Fodor (1976) developed an essentially structuralist approach
to thought without the interface with observability, which is legitimate, but his Mentalese
is not spoken here because it is not audible.

The entire topic of the interface between reality and the mind is so heavily 
burdened by centuries of philosophic discussions, starting from Plato’s cave, that it is 
best not touched by a chemist. But the topic is alive and exciting, especially in view
of some recent postmodernist attempts—uncharacteristically serious—to unite sciences 
and humanities. See for example, De Landa (1998). 

The opportunity to bridge the “embodiment” linguistics, generative linguistics,
and natural sciences was missed when Rene Thom published his Structural Stability
and Morphogenesis in 1972 (Thom, 1975). He refers (Thom, 1975, p. 116) to
generative grammar and draws a parallel between linguistics and biology, both having a
hierarchy of morphological levels.



Figure 6.1 Archetypal morphologies of Rene Thom 






























The name of Rene Thom (1923-2002) is practically absent from linguistic 
literature. He was a French mathematician who presented a very general theory and 
typology of sharp discontinuous changes in the evolution of forms. His book was 
enthusiastically received but rather quickly set aside, probably, as Thom noted, because it 
did not lead to anything calculable. Besides, it was sometimes easier to see his intent than 
to follow his thought. Sadly, Thom is fading from scientific memory even faster than Zipf 
who remains at least anchored there by Zipf's law and his “eccentricity.”

David Lightfoot comes closer than anybody to the concept of transition state in 
linguistics. He mentions Thom in his detailed and hands-on linguistic book (Lightfoot, 
1999), which is unique in its thorough attention to the science of complexity, the 
exactness of the large picture, and the metaphoric power. 

Figure 6.1 illustrates Thom’s typology of “catastrophic,” i.e., abrupt changes.
Note that almost every one of them carries the name of a verb, including give. In my chemical
interpretation, Thom distinguishes between three stages: stable and prolonged initial and
final states, not pinpointed on the time axis, and a short transition between them.

Thom’s types may seem nothing but modern ideograms that could be used in an
artificial script similar to Chinese. The cave pictures of the hunt, where some animals are
chased and others are lying dead or wounded, are, probably, the first pictograms of verbs
right at the fork in the road where language is about to split from art.

The minimalist ideograms of Thom can be compared with some of those 
suggested by linguists, Figure 6.2. 

The image schemata of Mark Johnson and George Lakoff (Lakoff and Johnson,
1980; Johnson, 1987) are less abstract than the archetypes of Thom. While the latter
clearly distinguish between the three diachronic stages of any change, the former appear
to be just ideograms, like the Chinese characters in Figure 6.3.

Thom’s “chreods” capture an important property of transition dynamics and 
possess some iconicity, while the static characters are purely symbolic and are 
interchangeable. According to Thom, chreod (“necessary path,” the term borrowed from
Waddington, 1957) is a stable configuration separated from another configuration by a 
catastrophic change. 



Let us put ourselves in the position of the first humans who have few abstract 
ideas. They see a piece of meat in the hands of Ug. After a while it goes to the hands of 
Og. Having witnessed this process, how can we communicate it, for example, as gossip 
(Dunbar, 1996/98)? 

What is the machinery of language trying to accomplish? The system appears to have 
been put together to encode propositional information—who did what to whom, what is
true of what, when, where, and why—into a signal that can be conveyed from one person 
to another (Pinker, 2003, p. 27). 



Figure 6.2 Some modern ideograms 

























In primitive life, who does what to whom, and what the alternatives are, is of the
utmost importance.


[Chinese characters from the figure are not reproduced here.]

sing, an ideogram for walk, “step and stop” or “left step and right step.”

chu, a pictogram of sprouting plant used as an ideogram for out.

attraction, with components: mouth, suck, inhale (left side), hand grabbing a person
(right side); bow and string, pull; tendon, force.


Figure 6.3 Ideographic composition in Chinese writing 


Among the enormous variety of attempts to represent graphically what I call
external sources of subsequent thoughts, semantic networks (Quillian, 1968) have been
the oldest approach in Artificial Intelligence. The arbitrariness of knowledge
representation, which reminds a chemist of the relative arbitrariness of writing 2D formulas
of 3D molecules, is the reason for the never-ending flow of such graphs. Here is another
example.

Cornelis Hoede (Hoede, 2003) and his group build very attractive knowledge
graphs as, for example:


PLUTO --EQU-- [ ] --ALI-- DOG


The square, called token, is “something” identified (EQU) as PLUTO and similar 
to DOG: “something like a dog equal to Pluto” 

The number of binary relation types is limited to eight:


EQU : Identity
SUB : Inclusional part-ofness
ALI : Alikeness
DIS : Disparateness
CAU : Causality
ORD : Ordering
PAR : Attribution
SKO : Informational dependency





And yet the simple “Mary and Mike married” requires four tokens and nine 
relations for representation, Figure 6.4. 
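
The token-and-relation structure of the Pluto graph can be sketched as a list of labeled edges. This is only an illustration: the eight relation names come from the list above, while the edge direction, the token label, and the helper function are assumptions made for the example and are not Hoede's notation.

```python
from enum import Enum

class Relation(Enum):
    EQU = "identity"
    SUB = "inclusional part-ofness"
    ALI = "alikeness"
    DIS = "disparateness"
    CAU = "causality"
    ORD = "ordering"
    PAR = "attribution"
    SKO = "informational dependency"

# A token is the unnamed "something"; here it is just a label (hypothetical).
TOKEN = "token_1"

# Each edge is (source, relation, target); the direction is an assumption.
GRAPH = [
    ("PLUTO", Relation.EQU, TOKEN),
    ("DOG", Relation.ALI, TOKEN),
]

def describe(graph, token):
    """Collect, per relation type, the names linked to the given token."""
    found = {}
    for source, relation, target in graph:
        if target == token:
            found.setdefault(relation, []).append(source)
    return found

# "Something like a dog equal to Pluto"
DESCRIPTION = describe(GRAPH, TOKEN)
```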



Figure 6.4 Example of knowledge representation. From Hoede (2003) 

Sowa (2000) provides an excellent guide in this world created by the drive to
overcome the suffocating linearity of our language. We see in postmodern art a
similar drive to overcome, in the form of an installation, the Euclidean restrictions on the
tangible classical form and its synchronicity. On the contrary, language itself started with
overcoming the overwhelming connectedness of the world and passing it through the
bottleneck of speech in segments.

The obsession with graphs, shared by chemists, has an underlying agenda in 
linguistics: to penetrate into the structure of thought. If we take up the vow of abstention 
from painting the images of thought, the graphs like those in this chapter fall in the 
category of images, which is a term of Pattern Theory. Image is the observable 
configuration generated by the object in the sensory anteroom of an organism, robot, or 
by an imaging instrument. The actual configuration of the object, which I call here the 
source, can be to some extent reconstructed from images, as it is done in CAT scans, 
Mars rovers, or the intelligence reports based on numerous and only partly reliable 
sources. The chemists do it from instrumental and analytical data. 

Speaking half-metaphorically, as if it were about medicine: in the image area between
the source and the thought, the skeleton of the source is cleaned of flesh and prepared
to be taken apart and arranged as a lineup of bones for transportation, like the unearthed
skeleton of a dinosaur, but this is all we can say about it. Thus, a series of CAT scan
images results in the verbal linear description and diagnosis (intelligence report), with
most of the computerized flesh retired to the archive or discarded.

The process of protolanguage generation, from the point of view of a chemist,
unfolds between the external source and the output, passing the stages of image
(projection of the source), thought (invisible), and speech (observable):

Source -> image -> thought -> output

The mathematical language of Pattern Theory is the lingua franca for all three 
stages. PT also offers its own approach to configurations in the mind. 

The General Mind Model (GOLEM) was suggested by Ulf Grenander (2003) as a 
natural application of Pattern Theory (Grenander 1976-1995) to systems of high 
complexity.

The internal output of GOLEM, which is a state of GOLEM’s mind, is a 
configuration called idea. It consists of atomic generators connected in a certain order, is 
limited in size, and is preceded and followed in time by other ideas. The idea can be 
spontaneous or induced by an external stimulus. It is selected along a probability 
distribution for generators and their connections. 



Figure 6.5. Configuration of an idea, after Grenander (2003), modified








Figure 6.5, modified from Grenander (2003), shows a typical idea of GOLEM in a
matrix form. It is a list of all generators in the content of the idea, together with the
connector graph. We can expect that in a further evolution of GOLEM, this
representation will persist.

The nature of generators here is irrelevant. They can be physical objects, ideas, 
emotions, memories, instincts, and all atomic components of a state of the system, in 
accordance with the compositional principle of PT. In linguistics, the generators are 
words or their roots, no further questions asked.

The probability distributions Q and A, from which the idea is stochastically 
selected, are calculated from the previous Q and A , as well as from the external input, 
and are subjected to dissipation in the form of forgetting. GOLEM’s mind, therefore, is a 
thermodynamically open system. While Q gives the probabilities of generators for 
content selection, A gives the acceptor functions (affinities of generators toward each 
other) from which the probabilities of bond couples in the connector are calculated. 

GOLEM’s idea, therefore, is a quartet <CONTENT, CONNECTOR, A, Q>. In 
the case of mental activity, probability distribution Q , in very general terms, can be 
compared with the priorities of an agenda: some items are more urgent than others. To 
use another metaphor, the state of mind is a pandemonium of generators, most of which 
are silent, and some of those few that are heard are louder than others. Accordingly, A 
reflects the strength of the bonds between generators and, therefore, the relevance and 
consistency of the content. 
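
The roles of Q and A can be illustrated with a toy sampler. This is a drastic simplification and not Grenander's algorithm: here Q is a plain table of weights, A is reduced to a table of bond probabilities for pairs of generators, and forgetting is a uniform decay. All names and numbers are hypothetical.

```python
import random

def sample_idea(q, a, size, rng):
    """Draw a toy 'idea': content sampled by Q, connector sampled by A.

    q: dict generator -> weight (stand-in for the Q distribution, unnormalized)
    a: dict (g1, g2) -> bond probability in [0, 1] (stand-in for A)
    """
    names = [g for g, w in q.items() if w > 0]
    weights = [q[g] for g in names]
    content = []
    # Sample distinct generators; the "louder" ones (higher Q) are more likely.
    while len(content) < min(size, len(names)):
        g = rng.choices(names, weights=weights, k=1)[0]
        if g not in content:
            content.append(g)
    # Connector: the list of bond couples among the selected generators.
    connector = []
    for i, g1 in enumerate(content):
        for g2 in content[i + 1:]:
            p = a.get((g1, g2), a.get((g2, g1), 0.0))
            if rng.random() < p:
                connector.append((g1, g2))
    return content, connector

def forget(q, rate):
    """Dissipation: every weight decays by the same factor."""
    return {g: w * (1.0 - rate) for g, w in q.items()}

Q = {"Ug": 3.0, "Og": 2.0, "meat": 2.0, "cave": 0.5}
A = {("Ug", "meat"): 0.9, ("Og", "meat"): 0.8, ("Ug", "Og"): 0.3}
IDEA_CONTENT, IDEA_CONNECTOR = sample_idea(Q, A, 3, random.Random(0))
```

The output is exactly a content list plus a sparse list of bond couples, the same kind of object as the list of chemical bonds discussed earlier.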

In the current Version 3.1 of GOLEM the output is a graph. Its internal precursor 
is connector in the form of a sparse matrix of the connector graph, which is essentially 
the list of bond couples between the generators selected into the content. It reflects the 
content and topology of the idea but its interpretation can be confusing for generators of 
high arity, as well as for fragmentary ideas. This confusion, however, is an intrinsic 
property of thought before it has been verbalized. While all the other graphic
approaches to thought known to me aim at expurgating any ambiguity, Grenander’s GOLEM
proudly carries the poster: TO ERR IS HUMAN.



7. CONFIGURATIONS, PATTERNS, AND NEAN


The key to the evolution of protolanguage, language, society, and life itself is lost in an 
unobservable past. All we can say is that evolution starts with something extremely 
simple and accumulates complexity by simple steps. 

In this chapter, pursuing the large picture, I am going to pass over the 
insurmountable stack of literature on evolution. I will firmly hold onto my chemical 
intuition and experience, as well as the major ideas of Pattern Theory, which for me has 
been a meta-chemistry. Along the way, the simplest and a simple, but not the simplest, 
mini-grammars for protolanguage will be described in a few lines. 

Pattern Theory (PT) is a branch of mathematics that embodies a very simple
principle of generating complexity, known since Democritus and his proponent
Lucretius (1958): complex structures are produced by combining simple elements.

Within the framework of PT (Grenander 1976-1995), both molecules and 
expressions, whether verbal, written, or pictorial, are configurations made of atom-like 
primitives, called generators, with potential bonds capable of locking into actual bond 
couples. Each generator possesses a certain bond structure (number, label, orientation, 
sometimes, spatial order, and numerical properties) which in chemistry is usually 
associated with valences. 

Since all differences between complex objects can be expressed in terms of
elementary blocks and a particular order of their connections, PT is an incomplete but
extremely general and powerful view of the world, especially outside the domain of
physics. It is a platform on which a more complete view of a great diversity of discrete
and discontinuous objects can be built by adding other mathematical tools. The
generative grammar, in which expressions are various combinations of various blocks,
falls into the application basket of PT together with chemistry and countless other areas.



Generators A and B 



Figure 7.1 Generators and configurations 

As an example, two generators from a global generator space G are identified as
A and B in Figure 7.1. They have bond structures characterized by local bond coordinates
a and b indexed both individually and by their generator. For example, bonds of
generator B are indexed as b1 and b2. Bond values β (numerical, Boolean, or strings of
characters) are attributed to the bonds. Given the generator space and a bond value
relation ρ, a configuration space can be defined so that for each pair of bonds with
values β and β', bond value relation ρ is either TRUE (bonding is allowed) or
FALSE (bonding is forbidden). A configuration built “by the rules” is regular.
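
The notion of a regular configuration can be sketched in code. This is a minimal illustration, assuming the simplest bond value relation (bonding is allowed when the two bond values are equal); the class, the function names, and the sample bond values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bond:
    generator: str   # name of the generator that owns the bond, e.g. "B"
    coordinate: str  # local bond coordinate, e.g. "b1"
    value: object    # bond value (number, Boolean, or string)

def rho_equal(beta1, beta2):
    """One possible bond value relation: TRUE if the two values match."""
    return beta1 == beta2

def is_regular(bond_couples, rho):
    """A configuration is regular if rho is TRUE for every bond couple in it."""
    return all(rho(x.value, y.value) for x, y in bond_couples)

a1 = Bond("A", "a1", "x")
b1 = Bond("B", "b1", "x")
b2 = Bond("B", "b2", "y")

REGULAR = is_regular([(a1, b1)], rho_equal)    # values match: bonding allowed
IRREGULAR = is_regular([(a1, b2)], rho_equal)  # values differ: bonding forbidden
```

Any other relation can be passed in place of `rho_equal`, which is where the flexibility of the scheme comes from.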



Appendix 2 presents some large real-life configurations that can make you 
speechless. 


The realism and power of PT comes from its flexibility. All the rules can be 
relaxed or made stricter. Thus, the bond value relation ρ can be characterized by a real
number between 0 and 1, which is neither TRUE nor FALSE, but just a probability of 
the bond. For example, the Greenberg tables—the example I will exploit ad nauseam — 
describe a space of regular bonds between the letters of the Arabic roots in various 
positions. They also fuzzily divide the entire space of possible letter doublets into regular 
and irregular. Thus, two identical letters in positions I and II are highly irregular. 

The regular Arabic roots are triplets of consonants. This pattern is defined on the 
following generator space (w21 and w22 are bond coordinates of generator W2):





The regular configurations are those for which ρ = TRUE if β = β', which gives
the single possible configuration W1-W2-W3: 



This configuration is, in fact, a pattern, i.e., a class of configurations that can be 
obtained one from another through the same similarity transformation, in this case, 
permutation of consonants. It means that W1, W2, and W3 are not individual
consonants, but the entire consonant alphabets. To push this idea to its limits, the 
combinatorial machine that Gulliver saw in Laputa, is nothing but a pattern of all possible 
strings of text, generated by permutation of letters. 

This property of combinatorial systems is characterized in practically all general
courses of linguistics as combinatorial infinity. In PT, as well as in chemistry, the
“infinity” could be drastically cut by the properties of generators, as is the case in
Semitic verb roots. Out of 29^3 = 24389 possible triplets, only 3775 roots (15.5%) were
listed in the source dictionary. The probabilities of the consonant pairs significantly differ.

The Greenberg tables describe an artificial object: the list of all Arabic verb roots. 
They have nothing to do with any actual conversation or text, in which only a small part 
of them could be used. The power of the approach is that the statistics of the list of verb 
roots is similar to a DNA analysis: it can be used for the kinship analysis between Semitic 
languages and dialects, both synchronically and diachronically. 
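
The count of possible consonant triplets and the share of listed roots quoted above can be checked directly, assuming an alphabet of 29 consonants with free choice in each of the three positions:

```latex
29^3 = 29 \cdot 29 \cdot 29 = 24389,
\qquad
\frac{3775}{24389} \approx 0.155 = 15.5\%
```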

From letters we switch to words, where a peculiar situation arises. Two-word 
generators (W1—W2) and (W2—W3) offer a natural choice of bond values identical 
with the constituent words. For bond relation ρ = TRUE if β = β', the following
doublet of composite generators is regular: 


It directly translates into W1—W2—W2—W3, which turns into a triplet if the 
rule of haplology is applied. Haplology was in the focus of Zipf's theory of the least
effort. 

Witness the phenomenon called haplology; when two similar syllables—they need not be 
identical—are adjacent, one may become permanently truncated. (Zipf, 1965, p.85) 
Abbreviation is then actually a short-cut; and moreover, since the stream of speech knows 
no other arrangement than that of time, an abbreviation of speech is a short-cut in time 
(ibid., p 284). 

Haplology—the word itself is ripe for haplology—results in two identical 
neighbors rewritten as one: 

W1—W2—W2—W3 -> W1—W2—W3
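
The rewriting rule can be sketched as a short function. This is a minimal illustration with a hypothetical function name; it treats haplology in the narrow sense used here, two identical neighbors rewritten as one.

```python
def haplology(sequence):
    """Rewrite every run of identical adjacent items as a single item."""
    result = []
    for item in sequence:
        # Strictly local rule: compare only with the immediately preceding item.
        if not result or result[-1] != item:
            result.append(item)
    return result

TRIPLET = haplology(["W1", "W2", "W2", "W3"])
```

The function never looks further back than one item, so it needs no memory of the rest of the utterance.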

If we are not in the mood for hairsplitting, this single rewriting rule already
constitutes the smallest transformational mini-grammar. Unlike a grammar of a full
language, this grammar is strictly local and does not require any extended memory for
generating an output. The speaker does not need to keep a large part of the entire image
of the source in mind to express the thought, as it might be required for using a German
verb with a separable prefix or for expressing a complex thought in any language. Only a
recognition of two adjacent generators as different or identical is required, which is an
effortless task for the human mind and, probably, a foundation of all animal intelligence.

Clearly, there is an even simpler grammar, used by animals, in which the 
configuration is a single isolated generator, i.e., a sign, but this is no grammar at all, but a 
naked lexicon. 

The first step in the evolution of grammar is the language with utterances 
connecting two generators. 

A slogan in knowledge graph theory is that “Thinking is linking somethings” (sic!).

Hoede (2003). 

I call this language Nean, as a tribute to our cave-dwelling ancestors. I do not 
mean that the Neanderthals really spoke it. 

To start with a binary relation is the most natural thing for a mathematician. The 
idea that protolanguage starts with a pair of connected words was convincingly discussed 
by mathematician Keith Devlin in his remarkable book (Devlin, 2000). He even gave a 
formal Chomskian structure for it as an elementary tree with unlabeled nodes (p. 170): 



This idea is completely in line with the general concept of protolanguage by 
Bickerton (1990, 1995, 2003), who had created the entire area, previously avoided by 
linguists, and whose work had an impact on Devlin (as well as on myself, of course). On 
my part, starting with Bickerton’s idea, I am also strongly influenced by Pattern Theory 
and the concept of configuration in which both generators and bond couples contribute to 
stability. What I consider the main product of this influence is the assumption that Nean
is capable of expressing much more than a simple link: it expresses a source up to a 
significant complexity. 

Nean directly translates binary links in the source into doublets in the utterance. 
Nean looks like a sequence of doublets. Of course, it can be peppered with singlets. A 
conversation in Nean looks like:

“ab aba bd ad d d ad !” 

“d d d ab c ab c!” 

Or: 

“Ug big big big Ug big Ug .” 

“Og Og hungry Og hungry hungry” 

Nean may require some phonological means to mark word delimiters and stops, 
but the Ug-Og dialogue shows that it is not absolutely necessary. The change of content 
or a natural pause caused by an external event would do. 

I claim, without any proof, that this is where the human language starts. A dog 
can say “I am hungry” or “I want to go for a walk” by whining at the table or sitting at 
the door because the dog’s “ I ” is clear from the context. Probably, literature on animal 
communication can provide more complex messages, but I am not familiar with it. 

From the example of this primitive grammar we can see the distinction between the
formal grammar, in which an unlimited number of generators can be combined in a 
doublet, and the actual utterance generation, in which the output is dictated by the source. 
For a formal linguist, any doublet is as good as any other, while for the linguist who 
looks at the language generation as natural process, only the doublet that preserves the 
connectivity of the source is good, and the one that does it best is the best. We have a 
parameter for comparison—called fitness in Artificial Life—which is a condition of 
“naturalness” in terms of thermodynamics as well as selection.

To make the next step and launch the evolution of grammar, we take a random
sequence of doublets and compress (“zipf”) it, using the following set of rewriting rules:

ab + bd -> abd

ab + ad -> abd OR adb

ac + bc -> abc OR bac

This “haploid Nean” is an example of a mini-grammar of a somewhat larger size. 
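
The three rewriting rules can be sketched as a function that takes two doublets and returns the triplets they allow. This is only an illustration with a hypothetical function name; it implements exactly the three rules listed above and nothing else.

```python
def merge(first, second):
    """Return the triplets allowed by the three rewriting rules, or [] if none applies."""
    (a, b), (c, d) = first, second
    if len({a, b, c, d}) != 3:
        return []                       # the rules need exactly one shared generator
    if b == c:                          # ab + bd -> abd
        return [(a, b, d)]
    if a == c:                          # ab + ad -> abd OR adb
        return [(a, b, d), (a, d, b)]
    if b == d:                          # ac + bc -> abc OR bac
        return [(a, c, b), (c, a, b)]
    return []

CHAIN = merge(("a", "b"), ("b", "d"))
SHARED_FIRST = merge(("a", "b"), ("a", "d"))
SHARED_LAST = merge(("a", "c"), ("b", "c"))
```

Where a rule offers two outputs, the choice between them is left open, as in the text; something outside the grammar, such as the source itself, would have to decide.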

Given the list of generators from generator space G, the probability of a
configuration is defined by the probability of its bond couples. Instead of products of 
probabilities, sums of logarithms of probabilities (“energies”) can be used, which makes a 
configuration additive regarding the “energies” of its bond couples, as it is typical for 
chemistry. The “energy” is a quantitative measure of what can be approximately 
characterized as the strength of the bond couple or the mutual affinity of two generators. 
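
The additivity of “energies” can be checked with a few lines of code. This is a sketch under two assumptions: the bond couples are treated as independent, and the “energy” is taken as the negative logarithm of the probability, so that a lower energy corresponds to a higher probability. The probabilities are invented for the example.

```python
import math

def energy(probability):
    """'Energy' of one bond couple: the negative logarithm of its probability."""
    return -math.log(probability)

def configuration_energy(bond_probabilities):
    """Energies add up where probabilities multiply."""
    return sum(energy(p) for p in bond_probabilities)

BONDS = [0.5, 0.25, 0.8]             # hypothetical bond-couple probabilities
PRODUCT = math.prod(BONDS)           # probability of the configuration
TOTAL = configuration_energy(BONDS)  # additive "energy" of the configuration
```

Converting `TOTAL` back with `math.exp(-TOTAL)` recovers `PRODUCT`, so the two descriptions carry the same information.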

A series of mathematical and practical problems arise when we apply the concepts 
of probabilities and energies to non-physical systems such as language. I am not qualified 
to analyze this aspect of theory and can only outline the essence of the difficulty as well 
as a way out. 

To use probabilities, we need a complete system of possible outcomes in a 
dynamic system, as well as a clear criterion of which outcomes are independent. This is a 
very tall order. 

Dynamic systems are ensembles of a large number of entities that randomly
exchange a certain additive (conservative) value, statistically distributed over the entities. 
In general, probability and energy are convertible currencies in closed (i.e., non-existent 
in reality) dynamic systems: the lower the energy of the state of the system, the higher its 
probability. The lower the overall energy, the more stable the configuration. 

This relation, completely transparent in statistical mechanics, which deals with 
idealized closed systems, is not so clear when we deal with real complex systems. 

To use energy, we need a scale with a zero point (or the so-called partition 
function in statistical mechanics). We meet neither condition in the chemistry of even 
slightly complex molecules. When we discuss evolving systems of life and Artificial Life 
(ALife), the physical energy becomes meaningless. Instead, entropy multiplied by
statistical temperature could be used—if only we had really well-defined dynamical 
systems, which we have not. 

The problem described here is very general. Probability theory is one of a few 
areas of mathematics where we can find debating sides (another is—no wonder—the 
foundations of mathematics). Bayesian theory, which is the subject of many sites on the
Web and can be found in any, even low-level, course of probabilities, points to the way 
out of the incompleteness of our knowledge about real systems, which we can 
complement by additional observable data. An avalanche of works has been triggered by 
the community of the worshippers of Reverend Thomas Bayes (1702-1761), but it safely 
passed over the unsuspecting chemists. 

In chemistry, the problem of the incompleteness of the data finds a very simple 
solution. Let us assume that only two structures in equilibrium are of interest, while the 
rest of the system remains the same. In this case the theory gives us not the absolute 
values of probabilities of two states, but their relative probabilities in equilibrium, and 
this is all we need for most practical purposes. The logarithm of the ratio of the 
probabilities is proportional to the difference of their energies. In general, when we do 
not know all possible alternatives, let us take only two. We can tell which one is more 
probable by comparing their energies, provided the rest of the circumstances are the same.
Equilibrium and evolution of a complex natural system, however, are two incompatible 
things. 
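
For two states in equilibrium the relation between relative probabilities and energies takes the familiar Boltzmann form. The symbols are assumptions introduced here: P_1 and P_2 are the probabilities of the two states, E_1 and E_2 their energies, k the Boltzmann constant, and T the temperature (outside physics, a statistical analog of it).

```latex
\frac{P_1}{P_2} = \exp\!\left(-\frac{E_1 - E_2}{kT}\right)
\quad\Longleftrightarrow\quad
\ln\frac{P_1}{P_2} = -\frac{E_1 - E_2}{kT}
```

The partition function cancels in the ratio, which is why the absolute probabilities, and with them the complete list of alternatives, are not needed.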

For somebody with a chemical frame of mind, like myself, looking outside
chemistry, Pattern Theory offers a general solution for any process: let us judge
configurations by their regularity. Irregular configurations are strained, stressed, unstable,
and short-lived. The regular ones are stable, normal, uncontroversial, legitimate, and
dominating. But what is irregular? It is what bends the rules. And what is regular? What 
conforms to the rules. Well, what is the origin of the rules? They are taken from the 
observations. Pattern Theory is an unusual kind of mathematics that restricts the freedom 
of imagination by the observable reality. There is no standard way to design generators, 
some models could be better than others, and a lot of human intuition must be involved in 
a development of an application. One might draw from the contraposition of regular and 
irregular structures an important conclusion that if a stable structure can transform into
another stable structure, it can do it, by definition, only through an unstable one. This is 
what chemistry is about. Otherwise all relatively stable structures on earth would 
immediately turn into the most stable equilibrium combination of the most stable 
individual structures and time would stop. 

The above peek into PT was extremely limited and superficial. One has to look 
into the original work to appreciate the richness of the subject. One aspect of PT that I 
haven’t even touched upon is the concept of the pattern as a group (in the mathematical
sense) of transformations that define a geometry of the configuration space. The second
aspect is the idea of template, i.e., a representative, “typical” (Aha! Here comes the mysterious 
“typical”) configuration of the pattern. It could be the most valuable treat for cognitive 
sciences: we all have mental templates for “cat,” “dog,” and “politics.” What are they? The 
centers of gravity of a pattern. 

It will do for my purpose to point to PT as a mathematical higher ground, 
strangely overlooked, for complex systems from which both linguistics and chemistry 
could be seen side by side as affectionate sisters. 

Much more important, the higher ground of PT allows me to represent both the 
source and the output of speech generation in the same philosophically neutral 
mathematical terms: as configurations. 


I will return to Nean in Chapter 12. 



8. SOME RISKY IDEAS ABOUT MATHEMATICS AND LIFE 


I must make here a risky digression—it could be safely skipped by the reader—addressed 
to all those interested in the big picture, including myself. This digression is intended to 
once again draw attention of all researchers of complex systems to PT. I venture to 
express an outsider’s opinion (anyway, mathematics is a language, too) on the 
applicability of mathematics to complex real phenomena. 

The fundamental sine qua non component of any mathematical system is the 
theorem of closure, which requires the set of terms to be well defined and closed to any
uninvited visitor during reasoning. It can be traced back to Aristotle and is a condition of 
logical thinking. In other words, mathematics is poorly equipped to deal with the notion 
of novelty. Once the world of a mathematical system, i.e., its terms and axioms, has been 
created, whether in six or 666 days, there could be nothing new under its skies. This is 
definitely not the case under the skies of the planet Earth, as geological, biological, and 
social evolution—and anybody’s personal life—clearly manifest. This is why 
mathematics, as well as physics, is not adroit enough to deal with evolution and the open 
set of terms. Real systems are inherently open to new and unanticipated terms, i.e., new 
generators of PT. 

Mathematics, however, would not be mathematics if it failed to design a 
mathematical system for formalizing something. As it seems to me, the seeds of a 
mathematical theory of novelty were planted in a very little known segment of 
Bourbaki’s monumental Elements of Mathematics about the scale of sets (Bourbaki, 
1968, p.259), to which I would refer the curious reader. It describes how a new set of 
terms is formed from the old one by converting new combinations into old elements for 
the next combinatory step. It seems tricky, but is in fact simpler than it seems. 

As far as Pattern Theory is concerned, its basic sets are remarkably open to 
novelty. PT is just born to be a mathematical tool of evolution because it can construct 
and implant new generators. It does so the same way scientists express new and 
groundbreaking ideas by using the strings of old letters and inventing some new symbols 
and terms. Moreover, by using the convertible currencies of probability and energy, PT 
stores in its shed all the necessary tools to model the realistic evolution of complex 
systems. Being capable of distinguishing between what is possible and what is more 
probable than something else, PT straddles the fence between the camps of Chomsky and 
Greenberg. 

Energy or quasi-energy, fitness, etc., can be considered a particular case of a more 
general concept of the natural world—stability, which is the other side of regularity. 

In real life some scientists (especially, chemists) deal with differences in energy 
and others (especially, in computer simulation, economics, and cognitive sciences) deal 
with conditional probability because those are the observable source data. 

It must be noted that although chemists freely convert relative energy into 
probabilities (in the form of equilibrium concentrations) and back, they are always aware 
of the inherent uncertainty as to whether all possible outcomes are taken into account. 
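
The conversion between relative energy and probability can be sketched with the Boltzmann relation. This is only a minimal illustration under stated assumptions: the three energy values, the temperature, and the function names are hypothetical and chosen for the example. Note that the normalization step presupposes a complete list of states, which is exactly the uncertainty chemists keep in mind.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def energies_to_probabilities(energies, temperature=298.0):
    """Boltzmann populations: p_i is proportional to exp(-E_i / RT)."""
    lowest = min(energies)
    weights = [math.exp(-(e - lowest) / (R * temperature)) for e in energies]
    total = sum(weights)  # normalizing assumes the list of states is complete
    return [w / total for w in weights]


def probabilities_to_energies(probabilities, temperature=298.0):
    """Inverse step: energies relative to the most populated state."""
    p_max = max(probabilities)
    return [R * temperature * math.log(p_max / p) for p in probabilities]


energies = [0.0, 2000.0, 5000.0]  # hypothetical relative energies, J/mol
probs = energies_to_probabilities(energies)
recovered = probabilities_to_energies(probs)
print([round(p, 3) for p in probs])
print([round(e, 1) for e in recovered])
```

The round trip recovers the original energy differences, but only because the set of states was closed from the start.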

This is not the case in the closed system of three-letter roots, where all 
combinations are completely countable. The roots of Arabic are examples of linguistic 
molecules. Greenberg’s results can be reformulated in terms of energy or affinity, i.e., 
attraction of root letters to each other—yet another reason for chemistry and linguistics to 
listen to the call of the common blood in the lush jungles of complexity. So are the giant 
molecules of musical compositions with the regularity consistently relaxing through the 
last two centuries. This is what, probably, keeps the drumbeat of pop in demand. Similarly, the 
statistics of marriage and divorce, to take a different angle, tell about the strength of 
marital bond in a social molecule and the position of the equilibrium between its two 
basic states of association and dissociation. 





Big ideas have a curious property of staying unnoticed just because of their large 
size. Coming to the end of my digression, I do not want to miss an opportunity to point to 
one such idea. I have nothing to lose by mentioning a remarkable way to circumvent the 
problem of “physicality,” which cuts a deep chasm between physical sciences and 
humanities, as well as between real and simulated systems. I realize that it may seem 
paradoxical and even perverse to both separated sides longing for an embrace. 

Suggested quite casually by Prigogine and expressed in a very general form by 
Rosen (1991, 2001), the idea consists in regarding the physical, closed, and calculable 
world as a particular case of a more general world where the vague, large, complex, hard to 
catch and itemize systems, like life, are a more general case than any physical model. 
They do not need an explanation through anything else. In a way, what Robert Rosen 
heretically suggested was a version of the Copernican revolution: the physical earthly 
knowledge is a satellite of a larger conceptual body of life sciences and evolutionary 
ideas in general. The “normal” reductionist angle of vision has been: life, society, and 
mind exist because there is some physical foundation for them. Robert Rosen reversed 
this relation: no, physics and chemistry exist because there is life on earth. No wonder, 
the first step that physics made in explaining life was to declare it impossible because of 
the improbability of a spontaneous assembly of DNA or RNA (Wigner, 1961). 

For more about the scale of sets, chemistry, mind, and some illustrations, see 
Tarnopolsky (2003). 





9. CHEMOLINGUISTRY: A CHIMERA 


In this chapter I discuss a very general paradigm which does not seem to follow from any 
other and is one of the relatively recently discovered foundations of modern chemistry: 

the theory of transition state. 

Honestly, a linguist has no need to know any down-to-earth chemistry. I will 
attempt to present here some ideas almost unknown outside chemistry as chimeras 
combining the properties of both chemistry and linguistics. Atoms will be labeled as 
meaningful words but will behave like atoms. We will use such familiar terms as atoms, 
energy, and even probability, in an intuitive and not in the strict physical and 
mathematical sense. We will try to find some criteria for checking the configurations for a 
measurable property. We will try to arrange our atoms into more or less realistic 
configurations. Coming back to Mark Baker’s “The Atoms of Language,” what if words 
were indeed atoms? Then both the expressions and their sources would be molecules and the 
way from the latter to the former would go through a transition state, as it happens with 
molecules. 

The term “transition state” is somewhat misleading. As all chemists know, it is 
not a state but a process. It is the state of flux. 

The concept of transition state (Eyring and Polanyi, 1931) is a very general 
concept of dynamics, the science of things in motion. It is used in kinetics, the part of 
dynamics that studies the speed of the motion, but only if the motion is discontinuous, 
i.e., catastrophic, in terms of Rene Thom. In essence, transition state theory regards it as 
continuous. Can you imagine that? 

Transition state theory (TST), introduced by Eyring and Polanyi ... in 1931 as an early 
attempt to determine absolute reaction rates, is too often considered the domain of the 
chemist or chemical physicist. However, the transition state (TS) is actually a general 
property of dynamical systems which involve an evolution from “reactants” to 
“products.” Such processes include, but are by no means limited to, the ionization of 
atoms, the dissociation or reaction of molecules, and even the escape of an asteroid from 
its orbit (Jaffe et al., 2000). 

Surely, any asteroid is a pending disaster. 

The theory of transition state elucidates why things happen by explaining why 
they do not. It postulates that if one stable state of a system can turn into another, there is 
an ephemeral and unstable transition state between them. Its energy (stress) is higher and, 
therefore, stability is lower than that of both stable states. It is irregular, unlawful, and 
cannot be portrayed by common chemical formulas. 

The transition state sets a barrier on the way of transformation. The lower the 
barrier, the more probable the transformation. This is why a sheet of paper does not ignite 
spontaneously and needs a burning match to push it over the barrier. For the same reason 
explosives can be safely stored and transported: they are protected by the barrier of the 
transition state on the way to the products of explosion. The detonator jolts the substance 
over the barrier. 
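
The statement “the lower the barrier, the more probable the transformation” can be made tangible with the Arrhenius factor exp(-Ea/RT), used here as an assumed stand-in for the chance of clearing the barrier. The barrier heights and temperatures below are hypothetical numbers, not measured values for paper or for any explosive.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def crossing_factor(barrier, temperature):
    """Arrhenius factor exp(-Ea/RT): relative chance of clearing the barrier."""
    return math.exp(-barrier / (R * temperature))


high_barrier = 150_000.0  # hypothetical barrier, J/mol
low_barrier = 100_000.0   # hypothetical lower barrier, J/mol

at_room = crossing_factor(high_barrier, 298.0)
near_flame = crossing_factor(high_barrier, 1000.0)  # the "burning match"
lowered = crossing_factor(low_barrier, 298.0)       # same temperature, lower barrier

print(f"high barrier at 298 K:  {at_room:.2e}")
print(f"high barrier at 1000 K: {near_flame:.2e}")
print(f"low barrier at 298 K:   {lowered:.2e}")
```

Either heating the system or lowering the barrier raises the factor by many orders of magnitude, which is why the match and the detonator work.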

Some general notes on transition state will be included in Chapter 11. 
Illustrations of mechanisms of chemical transformations, in both chemical and 
metaphorical terms, are given in Appendix 3. 

As an introductory example, let us turn to the splendid collection of stressed life 
configurations left by Shakespeare. When we speak about a good chemistry between 
persons, we mean that their relation is stable. The initial relations between Othello and 
Desdemona have a negligible stress, low energy, as a chemist would say. The malicious 
energy of Iago initiates the explosive transition like a detonator. The highly stressed 
situation dissipates its energy along the way to another stable but unfortunately tragic 
state. The short-lived process, which is the core conflict of the tragedy, is what a chemist 
can associate with transition state. The measurable parameter is stress, synonymous with 
instability. 

One may visualize atoms as ping-pong balls with a word on each, including 
symbols of chemical elements, written with a soft-tip marker. We attribute to them the 
ability to form bonds of various strength measured by the energy: the lower the energy, 
the stronger the bond and the more energy is needed to break it. Energy, therefore, is a 
measure of instability and, on a different scale, of improbability in a dynamical system. 
We will use also the terms stress or tension as the opposite of stability: the stressed 
configurations are those with high energy and low stability. 

The words can themselves be labels of real or imaginary objects and events. The 
theory of meaning is one of the most confusing intellectual areas and we had better 
avoid definitions, preferring models instead. 

Model 1 

The model illustrates what a chemist could expect from three atoms labeled as 
words: Tom, Tim, and book. The name of the transformation is GIVE. Its detailed 
description is: Tom GIVE[s] book [to] Tim. We can see the extremely stable Tom, Tim, 
and the book, but no such thing as “s” or “to.” But what is GIVE? 

We observe the following states: 

1. The initial state: Tom in contact with the book and Tim nearby. 

2. The unstable and short transition state of transfer, which is GIVE. 

3. The final state: Tim in contact with the book and Tom nearby. 

Figure 9.1 shows the GIVE transformation as a chemist could depict it. It turns 
out that there are two possible mechanisms of the process, for which A and B are two 
possible transition states. 





Since the initial and final states are relatively stable, the book can be in Tim’s 
or Tom’s possession indefinitely. In the act of giving, along mechanism A, Tom holds the 
book and offers it to Tim, who also touches it. For a short while the three atoms are 
locked in the ephemeral and unstable transition state, which is, in general, reversible. 
Mechanism A corresponds to a smooth continuous transfer. Either Tim or Tom, or both 
can change their minds concerning the transfer. But they can also fight for the book. 
Along mechanism B, Tom can simply leave the book on the table, after which Tim takes 
hold of it. The transition state for this version of GIVE is just three disconnected atoms. 
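
The two mechanisms can be written down as configurations, i.e., sets of bonds between the three atoms. This is a minimal sketch, assuming that a configuration is fully described by its set of undirected bonds; the helper names `bond` and `step` are hypothetical and serve only this illustration.

```python
def bond(a, b):
    """An undirected bond between two atoms."""
    return frozenset((a, b))


initial = {bond("Tom", "book")}                            # Tom holds the book
final = {bond("Tim", "book")}                              # Tim holds the book
transition_a = {bond("Tom", "book"), bond("Tim", "book")}  # both hold the book
transition_b = set()                                       # book lies on the table


def step(before, after):
    """Bonds broken and bonds formed between two configurations."""
    return {"broken": before - after, "formed": after - before}


for name, transition in (("A", transition_a), ("B", transition_b)):
    path = [initial, transition, final]
    for before, after in zip(path, path[1:]):
        change = step(before, after)
        print(name,
              "broken:", [sorted(b) for b in change["broken"]],
              "formed:", [sorted(b) for b in change["formed"]])
```

Along mechanism A the new bond forms before the old one breaks; along mechanism B the old bond breaks first, leaving three disconnected atoms in between.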



Figure 9.1 Two mechanisms of transformation GIVE from the point of view 
of chemistry. 













From the point of view of a chemist, GIVE is a name of a chemical—better to 
say “chimerical”—reaction between molecules Tom-book and Tim, resulting in the 
molecules Tom and Tim-book. 

How fast this reaction could run depends on the energy (height) of the transition 
barrier. To chimerize more, we can even say that while Tim and Tom can do well without 
each other, a book without an owner is an irregular and therefore unstable configuration, 
while a book with two owners is also a potential source of conflict, unless there is a stable 
bond between the owners. For the forceful transfer, the transition state can be highly 
stressed and its outcome hardly predictable unless we know which stable state is less 
stressed, i.e., who is stronger, Tom or Tim. 

The transition state starts with some really invisible processes in Tom’s mind. To 
make any predictions about the direction and speed of the transformation, we have to 
evaluate an open set of other circumstances not in the slightest way reflected in Figure 
9.1: relation between Tom and Tim, whether Tom has finished reading the book, the availability 
of other books that Tim could use as substitute, the influence of other persons who have 
their opinion about Tom, Tim, and book, and even whether the weather is better for reading 
than for sailing. This is exactly what a chemist does to study a chemical transformation 
and optimize its course by increasing the speed of the beneficial transformation and 
suppressing all the competitive directions. 

Catalysis is the standard tool for speeding up one direction of structural change at 
the expense of the competitive ones. Catalysis can be compared with the role of a parent 
in a smooth transfer of a book from one child to another: the parent forms bonds with all 
the participants and decreases the stress of the transition state. 

In fact, both types of transition in Figure 9.1 are known in the process of 
chemical transfer of an atom from one place to another, with some additional 
considerations. Once again, the transition states in chemistry, like thoughts in the brain, 
are not typically observable. They are, actually, chimeras of imagination, although some 
progress has been recently made in catching them. But they leave evidence which is 
observable and can be identified by the same detective methods as a crime without a 
witness, including forensic experiments (Appendix 3). 





The above chimerical mechanisms are popularizations of what happens during the 
chemical transformation: it goes through a transition state the energy of which determines 
how fast the transformation will happen in the short run. Tim and Tom exist in single 
copies, but molecules are numerous. In the long run, since the passage through the 
transition barrier is reversible, there will be an equilibrium depending on whether the 
initial or final state has a lower energy. My illustration is intended to draw attention to the 
kinetics of transformation, i.e., the short-run outcome. Chemistry is interested most of all 
in the speed of the transformation. If there is more than one direction of a process, 
chemistry answers the question “what is going to happen when the process starts?” in the 
typically chemical manner: 

In the short run, what happens is what can happen faster, 

i.e., what goes through the least stressed transition state, and 

in the long run it is the less stressed stable state. 
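
A toy calculation illustrates the principle. Assume a starting material A that can reversibly turn into B through a low barrier or into C through a high barrier, with C being the most stable state. The rate constants are hypothetical; a larger constant stands for a lower barrier, and the ratio of forward to backward constants stands for the relative stability of the two states.

```python
def simulate(t_end, dt=0.01):
    """Euler integration of A <-> B (low barrier) and A <-> C (high barrier)."""
    k_ab, k_ba = 1.0, 0.1      # fast pair: low barrier, B moderately stable
    k_ac, k_ca = 0.05, 0.0005  # slow pair: high barrier, C the most stable state
    a, b, c = 1.0, 0.0, 0.0    # everything starts as A
    for _ in range(int(round(t_end / dt))):
        flow_ab = k_ab * a - k_ba * b  # net flow from A to B
        flow_ac = k_ac * a - k_ca * c  # net flow from A to C
        a -= (flow_ab + flow_ac) * dt
        b += flow_ab * dt
        c += flow_ac * dt
    return a, b, c


short_run = simulate(3.0)
long_run = simulate(3000.0)
print("short run (A, B, C):", [round(x, 3) for x in short_run])
print("long run  (A, B, C):", [round(x, 3) for x in long_run])
```

In the short run B dominates because it forms faster; in the long run the mixture drifts to C, the less stressed stable state.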


The universality of this principle can be seen in any modern war: it is easy to start but hard to finish. 

The aspect of speed (usually called rate in chemistry) has attracted little attention 
of philosophers used to the fleshless creations of the mind where everything that can be 
put into words is possible and it pops up in all minds with the same speed. Instead, 
endless debates about body and spirit, form and substance, semantics and syntax, 
meaning and sign, thought and utterance, all within the static framework of frozen 
structures, have been rolling through centuries, apparently, without any barriers of any 
kind. 

Why do structural transformations happen at all? 

In chemistry all particles naturally follow a distribution of energies (Maxwell 
distribution), so that most molecules have energy within a medium range. There are 
always particles with energy above that level and their collisions result in passing the 
transition barrier. 
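
How small the energetic minority is can be estimated from the distribution itself. The sketch below assumes the kinetic energy distribution of an ideal gas in three dimensions and integrates its tail in closed form; the barrier value is hypothetical. Real reaction rates depend on more than this fraction (for example, on the geometry of collisions), so the numbers are an illustration only.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def fraction_above(energy, temperature):
    """Fraction of particles with kinetic energy above `energy` (J/mol)
    for a three-dimensional Maxwell-Boltzmann distribution."""
    x = energy / (R * temperature)
    return math.erfc(math.sqrt(x)) + 2.0 * math.sqrt(x / math.pi) * math.exp(-x)


barrier = 50_000.0  # hypothetical barrier, J/mol
for temperature in (298.0, 400.0, 600.0):
    print(f"{temperature:.0f} K: {fraction_above(barrier, temperature):.2e}")
```

The fraction is tiny at room temperature and grows steeply on heating, which is why most chemical transformations speed up with temperature.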

There is the fundamental thermodynamics of open systems, but no watertight 
theory that could predict or explain their design. Examples of open systems are the 
atmosphere with weather, life with evolution, society and language with history. It is not 
clear whether such a theory is possible because any general theory would be silent 
regarding most interesting problems which are always concrete. Moreover, evolution of a 
large open system is an interplay of chance and necessity. The best way to observe 
transitions of large complex systems is to read history, which is a roster of long-time 
stagnations interrupted by short-term periods of turmoil that do not necessarily lead to a new 
structure, as well as of long-term drifts. Imperial China, the Roman Empire, the French 
Revolution, and the Industrial Revolution are classical examples. Lightfoot (1999) provided 
an excellent concise review of this entire area not only for linguists but also for chemists 
and anybody else who is not shielded from the world by a TV screen. 

For a chemilinguist, the development of pidgin and creole languages (Bickerton, 
1981) is an example of the resolution of an initial stressful situation of the Babel Tower 
type. The mass import of West European words into the old Russian language under 
Peter the Great (the pattern that repeats itself today, so that two alphabets, Latin and 
Cyrillic, could be used intermittently), the mass invention of a new lexicon on the base of 
the old language in China, and the patriotic defense against imported words in Hungary 
two centuries ago are parallel historic processes and are all as much answers to a stress as 
many scientific shifts and technological inventions. One can only guess whether the theory of 
punctuated equilibrium in biological evolution is just a spoof of human history. 

The notion that an utterance has a short-time history and unfolds in discrete 
time as a mechanism, i.e., a sequence of states (derivations), is to Chomsky’s 
unquestionable credit. But the main revolutionary discovery made by George Zipf (Zipf, 
1949, 1965), whose name is conspicuously absent from many recent works on linguistics, 
even those taking a comprehensive view of the field (from Jackendoff to Uriagereka, for 
example), was indigestible by static formalism. 

If you call something “move,” how fast is it? Zipf had no direct measure of effort 
whatsoever and identified effort with the word length. While Zipf’s theory remains 
controversial, his results are not. A chemist would certainly reformulate the principle of 
the least effort as the principle of the fastest transformation. 





Model 2 

In Model 1 only one direction of transformation was possible. Chemical reactions 
usually run in several possible directions, theoretically, in all of them, which is an 
incredibly dense branching, fortunately, as implausible as the linguistic “discrete 
infinity.” Let us take a more ambiguous case of the direct-indirect object tandem: GIVE 
(ACTOR, RECIPIENT, OBJECT) with several alternatives. It corresponds in English to 
at least two expressions (excluding a bunch of Passive forms): 

1. ACTOR GIVEs an/the OBJECT to-RECIPIENT. 

2. ACTOR GIVEs RECIPIENT an/the OBJECT. 

The first expression can mean that ACTOR gives to the RECIPIENT one of 
several available types of OBJECTS, selecting an apple and not a book or a flower. It can 
also mean (a perilous situation in mythology!) that the ACTOR gives the single available 
apple to one of several RECIPIENTS. Or, it is one of several ACTORs who GIVEs the 
only available OBJECT to the only RECIPIENT. It may be a combination of various 
situations. It may also mean that the object is taken from the ACTOR by force, etc. It can 
be anything observable by an individual or a group. To reveal the ambiguity in the act, let 
us analyze the act itself, representing it as a transformation of a configuration, Figure 
9.2, where the usual transformation A is followed by a more complicated situation B. We 
can see a highly uncertain transition state with several outcomes. 






Figure 9.2. Action GIVE represented as configurations 

In the initial state of transformation A, which can be reversible, the actor is 
connected to the object. In the final state, there is a bond between the recipient and the 
object. Between them lies the ephemeral and fleeting transition state in which both actor 
and recipient retain the connection with the object. The transition state has a higher level 
of uncertainty than both stable states because the outcome is not known: the transfer of 
the object can be delayed or cancelled, or the situation can turn into a fight. 

We avoid the term entropy, using uncertainty, stress, ambiguity, and irregularity 
instead, because we cannot calculate entropy without a closed set of outcomes and a 
probability distribution over it, which is so easy in computer models but hardly ever is 
achievable in real life. 
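
The point can be seen in the formula itself. A minimal sketch of Shannon entropy follows; it refuses to compute anything unless the probabilities cover a closed set of outcomes. The four-outcome distribution for the transition state (transfer, delay, cancellation, fight) is a hypothetical assignment of numbers, which is precisely what real life rarely supplies.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits; defined only for a closed set of outcomes summing to 1."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must cover a closed set of outcomes")
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


stable_state = [1.0]                     # the outcome is known
transition_state = [0.5, 0.2, 0.2, 0.1]  # hypothetical: transfer, delay, cancel, fight

print(shannon_entropy(stable_state))
print(round(shannon_entropy(transition_state), 3))
```

With an open set of outcomes there is no list to sum over, and the function has nothing to work with.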

In transformation B, where at least two objects and two recipients are involved, 
the uncertainty is significantly higher and there is a whole array of outcomes regarding 
who gets what, if anything. A web of relations of different strength connects the actor and the 
recipients, so that a combinatorial space of transition states can be described by a matrix 
of bond strengths between its components. The matrix can be strongly influenced by 
preceding states and memory traces. The relative affinities of the actor toward the 
recipients, as well as to objects, may be the decisive factor in choosing one recipient 
(object) over the other, as it in fact happens beyond mythology. 
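
A small numerical sketch of such a matrix follows. All bond strengths are hypothetical values in arbitrary units, and the rule that weights an outcome by the exponent of its total affinity is an assumption borrowed from the energy and probability conversion used by chemists, not a claim about how real actors decide. The list of outcomes is closed artificially to four pairs.

```python
import math
from itertools import product

labels = ["actor", "recipient_1", "recipient_2", "apple", "book"]
index = {name: i for i, name in enumerate(labels)}

# Hypothetical symmetric bond strengths (arbitrary units; 0 means no bond).
strength = [
    # actor  r_1  r_2  apple book
    [0.0, 2.0, 0.5, 1.0, 1.0],  # actor
    [2.0, 0.0, 0.0, 1.5, 0.2],  # recipient_1
    [0.5, 0.0, 0.0, 0.3, 1.0],  # recipient_2
    [1.0, 1.5, 0.3, 0.0, 0.0],  # apple
    [1.0, 0.2, 1.0, 0.0, 0.0],  # book
]


def affinity(a, b):
    """Look up the bond strength between two named components."""
    return strength[index[a]][index[b]]


def outcome_weights(temperature=1.0):
    """Weight each (recipient, object) outcome by exp(total affinity / T)."""
    outcomes = list(product(["recipient_1", "recipient_2"], ["apple", "book"]))
    raw = {o: math.exp((affinity("actor", o[0]) + affinity(o[0], o[1])) / temperature)
           for o in outcomes}
    total = sum(raw.values())
    return {o: w / total for o, w in raw.items()}


weights = outcome_weights()
best = max(weights, key=weights.get)
for outcome, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(outcome, round(weight, 3))
```

Changing a single cell of the matrix, for instance as a trace of a preceding state, can shift the preferred recipient or object.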


















































































The outlined picture is consistent with the chemical paradigm, which can be 
roughly generalized as: the transformation through the least stressed (less ambiguous, 
uncertain, and irregular) transition state is the most probable one—in the short run. 






[Diagram: atoms Jim, pen, and Sam; transformations labeled TAKE, SNATCH, and GIVE/GET] 

Figure 9.3 Variations of object transfer 


The somewhat “thomistic” Figure 9.3 starts with the general situation A of 
transfer of a pen from Sam to Jim where the exact mechanism is shielded by the gray 
square. Transformations B to D reveal a variety of subtleties in the abstract transfer: 

B. Antagonistic relations shown by the two-head arrow suggest a forcible takeover. 

C. Unexpected takeover generates antagonism. 

D. Willful transfer against the background of friendly relations. 

These subtleties reveal an important property of the real world: the source can 
have different images and the configuration of the source can be reliably reconstructed 
only from the totality of images. 

The purpose of my illustrations is to show how the circumstances of an 
observable event, including the often complicated relations within its human and material 
participants, influence the outcome. The relations constitute the social framework and 
they are socially meaningful. We sometimes overlook, as with every big picture, that for 
some reason, human mind, language, tools, and society appeared all together and, 
probably, are just the extrabiological aspects of Homo sapiens. If the tools go back to 
Homo habilis, about two million years ago, so may language. The questions of this kind 
are difficult to answer in any substantial way because the spoken language does not leave 
a material trace. All we know from observations of ourselves and animal societies is that 
communication of any kind is a sine qua non of social life. 

It is not up to a chemist to engage in such discussions. What chemistry 
demonstrates, if abstracted from the material nature of its generators and bonds, is the 
fine structure of the transition from one pattern to another. It portrays a discontinuity as 
a continuous process and this is where it goes farther than Rene Thom. 

In the chemistry of reversible transformations, once the system is initialized and 
brought into motion, the final equilibrium is defined in the long term by the energies of 
the initial and final state, while the short run process is defined by the height of the 
transition barrier. In the chemistry of open systems, to which life, mind, and society 
belong, the system can be maintained far from equilibrium as long as it is supplied with 
free energy and can dissipate heat, but in what particular form? In the form to which it 
has arrived through its preceding historical odyssey, with all its trials, errors, accidents 
and choices, following the beautiful metaphor of David Lightfoot. 






The transition states are typical for the open non-equilibrium systems that are 
searching for a steady state after having been knocked out of the previous steady state. 
The almost forgotten profound analysis of this phenomenon of homeostasis belongs to 
Walter Ross Ashby, one of the founders of modern artificial intelligence (Ashby, 1960, 
1964). The significance of his analysis for the problems of emergent properties comes 
from the fact that his homeostat was homunculus-sterile and algorithm-free. Regarding 
language origin, there was nobody to teach protolanguage to the first speakers. Language 
had to emerge from the spontaneous interplay of configurations in the mind with the 
configurations of the social life. 

There is a deep parallel between a molecular system and the homeostat: the 
interacting molecules and the blocks of Ashby’s machine spontaneously find a state of 
the lowest energy through a series of short-lived transition states of an increased energy. 
The difference is that in the molecular system the number of blocks is very large. The 
most important similarity is that each molecule and each block have all the other kinds of 
blocks in their topological neighborhoods. 





10. TIKKI TIKKI TEMBO: LANGUAGE AS A FORM OF LIFE 

The molecular matrix is a mathematical object of the same nature as Greenberg’s table of 
Arabic roots. Instead of the number of bonds between pairs of atoms, the distances 
between them or the bond strength values could be entered. Bond energy is a measure of 
improbability that the bond will be spontaneously broken. To somewhat vulgarize the 
chemical reality, the probability of bond breakup is extremely low at room temperature 
but goes up with temperature or irradiation. This vulgarization is minimal for 
hydrocarbons, such as the components of mineral oil. 

In general, a simple skeletal graph is completely represented by a matrix of 
incidence consisting of only zeros and ones. The graph is the matrix, and its picture is only a 
visualization. The square matrix, however, can be made as realistic as we want by adding 
qualitative and quantitative flesh to the bare topological bones and inserting numerical 
values into the cells of the matrix. 
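
As a small illustration, take the carbon skeleton of propene, C1=C2-C3. The sketch below starts from the matrix of bond counts, strips it to the bare skeleton of zeros and ones, and then adds quantitative flesh in the form of bond energies. The energy values are approximate textbook averages inserted for illustration, and the helper `edges` is a hypothetical name.

```python
atoms = ["C1", "C2", "C3"]  # carbon skeleton of propene, C1=C2-C3

# Number of bonds between each pair of atoms.
bond_count = [
    [0, 2, 0],
    [2, 0, 1],
    [0, 1, 0],
]

# Bare topology: 1 where any bond exists, 0 otherwise.
skeleton = [[1 if n else 0 for n in row] for row in bond_count]

# Approximate average bond energies in kJ/mol, keyed by bond count (illustrative).
ENERGY_BY_COUNT = {0: 0.0, 1: 347.0, 2: 614.0}
bond_energy = [[ENERGY_BY_COUNT[n] for n in row] for row in bond_count]


def edges(matrix):
    """List each bond once from a symmetric square matrix."""
    size = len(matrix)
    return [(atoms[i], atoms[j], matrix[i][j])
            for i in range(size) for j in range(i + 1, size) if matrix[i][j]]


print(edges(skeleton))
print(edges(bond_count))
print(edges(bond_energy))
```

The three matrices share the same pattern of nonzero cells; only the values inside the cells differ.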

For enthusiasts of long mental leaps, graph as a mathematical object is a beautiful 
example of an extralingual language universal, just look at Appendix 2. If it sounds 
oxymoronically like “the empty category is not empty,” it is only because such terms as 
language and life each acquired double meanings after the advent of computer 
science and molecular biology. There are human language and biological life, but there 
are also formal languages of programming and mathematics. 





As soon as we have a population of interacting configurations, there is a 
possibility of an Artificial Life system. 

In the most general terms, life (it could be called meta-life) is a system capable 
of (1) replicating itself with (2) errors (3) while using a limited resource of energy. The 
computer that uses the same tiny energy to display any picture on its monitor is an 
extremely misleading device for those who want to live in the real world. People will not 
switch from despotism to democracy just because democracy and despotism are 
words of the same length and require the same effort to type on the keyboard. 

It is impossible to review here the modern ideas about life-like systems, and the 
following remarks will be fragmentary. The major problem with any large picture is that 
you cannot find a large enough frame to hang it on the wall. You have to reduce it to a 
size where many important details are just specks. The picture painted by Devlin (2000) is more a 
window than a picture: it allows you to see the large world through its modest frame with the help of the 
optical abilities of mathematics. 

For the Darwinian biochemical life, duplication and errors (mutations) are 
obvious, while the limited character of resources of matter and energy is not always 
kept in mind. 

Heat alone cannot be directly utilized by life unless it is processed by power stations, internal-combustion 
engines, and other thermal machines. Only humans have been capable of doing this, albeit only 
after 1700. This brought to life modern civilization, which is also a form of meta-life. Technology, the most 
conspicuous part of modernity, uses blueprints and descriptions the same way living cells use DNA for 
their replication. The limited resources of technology are capital, labor force, matter, energy, and 
consumption. This line of discourse would certainly distract us from protolanguage and I have to stop here. 

There is a trend in linguistics that views language in terms of population 
dynamics, i.e. as a form of generalized life. For initial references, see Komarova and 
Nowak (2003). Similar or related ideas can be found in works of James Hurford, Simon 
Kirby, Angelo Cangelosi, and others. 

Language reproduces itself with mutations within the social communication. But 
what is its limited resource that exerts a selective pressure? It is not energy, because 
verbal exchange does not take a lot of energy. I believe, not claiming any originality, that 
it is time for both expression and understanding. If communication is too slow and far 
behind the pace of events, it fails to perform its function and the bottleneck linguistic 
phenomena die out. The fast development of a distinctive language for wireless text 
messaging by teenagers is a supporting example. 

The situation can be illustrated by the Japanese folk tale Long Name, known in 
the USA as the Chinese folk tale Tikki Tikki Tembo (Tikki, WWW, A). 

As I remember it from my childhood, the parents gave their boy a very long 
name to ensure his happy and long life: 

Tikki Tikki Tembo No Sarimbo Hari Kari Bushkie Perry Pem Do Hai Kai Pom 
Pom Nikki No Meeno Dom Barako. 

When the boy fell into a well, the children around started to call his parents but 
could not chant his name correctly and had to start all over again. While they kept trying, 
the boy drowned. 

NOTE. There was quite a discussion on the Web about whether Tikki was a Chinese or a 
Japanese tale (Tikki, WWW, B). I remember it as a Japanese tale in a Russian translation, 
and Ariko Kawabata, a participant in the discussion, confirmed it, which does not exclude 
its Chinese origin. 

There are two “biological” approaches to language. One regards language as a 
classical biological adaptation (Pinker and Bloom, 1990), while the other one is to see it 
as a form of meta-life (general reviews: Komarova and Nowak, 2003, Hurford, 2003). 
The second trend opens a Pandora’s box of computer simulations, but the approaches do 
not conflict. 

I believe it is appropriate to refer here to Cavalli-Sforza (2000) with his 
panoramic view of human evolution. He drew close evolutionary parallels of genome and 
language—parallels that did not necessarily intersect. 

The problem with adaptation in Darwinism lies in the circularity of the concept of 
fitness. Cavalli-Sforza had a clear view of fitness as the rate of reproduction: the genome 
that reproduces itself faster is more fit. This approach, being essentially Darwinian, 
breaks the vicious circle by offering a quantitative measure of fitness regardless of the 
material nature of meta-life and identifies fitness with nothing but the observable speed. 
It is accepted in ALife. 

The entire treatment of life as competition of replicating sequences for a limited
resource, it must be noted, comes from Manfred Eigen, a Nobel Prize recipient for
chemistry (1967). His meta-chemical ideas were originally expressed in Eigen (1971-1978)
and later in Eigen and Schuster (1979). Some elements of Eigen’s theory, in the
form of population dynamics, can also be found in Komarova and Nowak (2003). Eigen
used a linguistic example for illustration (Eigen, 1977). This is how the initial sentence
evolved after a number of reproductive cycles in a life-like model:

1 TAKE ADVANTAGE OF MISTAKE 

5 TAKF 1DVALTAGE OF MISTAKE 

10 TALF ADVALTACE OF MISTAKI 

70 TAKEB 7VALTAGI LV MIST1KE 
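The drift of a copied sentence can be imitated in a few lines. The following is only a minimal sketch, not Eigen's model: it assumes a fixed per-character copying error and no selection at all, and the names `replicate`, `drift`, `hamming`, the alphabet, and the error rate are all hypothetical choices made for illustration.

```python
import random
import string

# Assumed alphabet: capital letters and the space (an illustrative choice).
ALPHABET = string.ascii_uppercase + " "


def replicate(sentence, error_rate, rng):
    """Copy a sentence; each character is miscopied with probability error_rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < error_rate else ch
        for ch in sentence
    )


def drift(sentence, cycles, error_rate=0.02, seed=0):
    """Return the lineage of successive copies, starting with the original."""
    rng = random.Random(seed)
    lineage = [sentence]
    for _ in range(cycles):
        lineage.append(replicate(lineage[-1], error_rate, rng))
    return lineage


def hamming(a, b):
    """Number of positions at which two equally long strings differ."""
    return sum(x != y for x, y in zip(a, b))


lineage = drift("TAKE ADVANTAGE OF MISTAKE", cycles=70)
for cycle in (0, 5, 10, 70):
    print(cycle, lineage[cycle], hamming(lineage[0], lineage[cycle]))
```

Without selection the errors only accumulate; a life-like model would also let the copies compete, which is the kinetic point made in the next paragraph.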

The main thesis of kinetics is neither chemical nor physical. It is based on 
common sense: the more molecules are present in the unit of volume, the higher the 
probability of their collision, which increases the probability of the successful collisions 
that lead to transformation. The limited resource in chemistry is strictly material: the 
fastest reaction pulls the rug under the feet of competing reactions by consuming their 
starting material, common for all, so that the slower reactions are increasingly suppressed 
with time. 
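The suppression of the slower process can be made tangible with a small numerical sketch. It assumes, beyond what the text states, that both products grow autocatalytically (as replicators do) from one shared resource, and it uses simple Euler steps; the rate constants, seed amounts, and the function name `compete` are hypothetical.

```python
def compete(k_fast=2.0, k_slow=1.0, resource=1.0, dt=0.001, steps=20000):
    """Two self-reproducing products draw on one common resource.

    Growth of each product is proportional to its own amount and to the
    remaining resource (mass-action form, assumed for illustration).
    Returns the final amounts, the remaining resource, and the history of
    the fast product's share.
    """
    fast = slow = 0.001  # small seed amounts, an arbitrary choice
    shares = []
    for _ in range(steps):
        d_fast = k_fast * fast * resource * dt
        d_slow = k_slow * slow * resource * dt
        resource -= d_fast + d_slow  # the common starting material is consumed
        fast += d_fast
        slow += d_slow
        shares.append(fast / (fast + slow))
    return fast, slow, resource, shares


fast, slow, left, shares = compete()
print(f"fast={fast:.3f} slow={slow:.3f} resource left={left:.3f}")
print(f"share of the fast product: {shares[0]:.3f} at start, {shares[-1]:.3f} at end")
```

The share of the faster product never decreases in this sketch: once the common resource is gone, the slower one has no material left to catch up with.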

Eigen’s works started the entire area of Artificial Life. They are too technical to 
be explored here, but a whole host of beautiful and general ideas on evolution, fitness, 
language, and music, against a rich cultural and philosophical backdrop, can be found in 
his popular book with Ruthild Winkler, originally published in 1965 (Eigen and Winkler, 
1993) well before his detailed works on molecular evolution. It contains a chapter on 
molecules and language. 

I cannot resist the temptation of quoting Eigen on Chomsky: “..we could say that
Chomsky’s linguistics applies to language in the same way that thermodynamics does to
the weather,” which is hardly a compliment from a natural scientist. For the context, the
reader should consult page 269 of Eigen and Winkler (1993). In the same book one can also
find a discussion of the frequency analysis of intervals in musical compositions very similar to
Greenberg’s analysis of Arabic roots. Of course, more modern sources can be found.

The forms of meta-life, other than biochemical life, are language, culture, customs, 
society, technology, science, art, even some games, and the list is open-ended. All of 
them originated and have been evolving on the platform of biological life, under which 
we find nothing but chemical reactions. The term Artificial Life, which originally echoed 
Artificial Intelligence, can be used instead of meta-life in both meanings: as human 
simulation of life and as forms of meta-life created by human reason and hands, whether 
intentionally or as the game of chance and necessity. In the context of Darwinism, the 
non-biotic life was characterized by Dawkins (1989) as life of memes, the cultural and 
mental counterparts of genes. Note, however, that it was Zipf who first used the 
expression “genes of thought.” 

If we accept the kinetic concept of evolution, then the significance of the main
linguistic idea of Zipf (1949, 1965) becomes obvious. The patterns of live speech are
selected by the criterion of least effort, which comprises the shortest length, the fastest
utterance, and the fastest subsequent understanding.

To complement the ancient Chinese tale, here is a modern example. The Russian 
post-Communist revolution brought the abbreviation MREO UGIBDD GUVD, which 
itself is so long that it asks for another abbreviation, probably, MUG. It means: 

The local office of the inter-regional office of technical inspection of 
transport of the department of state safety inspection of road traffic of the 
top directorate of internal affairs. 

This Tikki Tikki Tembo of a kind corresponds to the American auto registry. 





11. ZIPFING THE CHIMERA 


In order to describe a source, we have to flatten its configuration, turn it into a string,
and squeeze it through the bottleneck of speech. This may require breaking some bonds
and forming some new ones, which is exactly what chemistry is about. Speech generation,
from the point of view of a chemist, is a “chemilinguistical” reaction. Like any natural
process in a non-equilibrium system, it requires free energy, which is physical energy in
a form convertible into work. It is supplied to humans with food. Naturally, if language
helps to get more food, it will survive in a population. There is nothing in
thermodynamics, however, that predicts the origin of language. Not intending to engage
further in a discussion of this large and difficult topic (see Pinker, 2003), I mention it just
to point to a large area of the big picture: the adaptive aspect of language. The very fact
that language is an adaptation in no way predicts its structural properties. In a sense,
everything that exists in living systems is adaptation.

Figure 11.1 depicts the essence of the process of linearization of a non-linear 
configuration. The initial state is what is called image in PT. It is the configuration of the 
source processed by organs of perception—another vast topic of cognitive sciences. 

The configuration space is the first of the regular structures that we are building. The 
next consists of images, a concept that formalizes the idea of observables. In other words, 
a configuration is a mathematical abstraction which typically cannot be observed directly, 
but the image can (Grenander, 1996, p. 91). 





Unlike thoughts, images in individual minds can be compared and shared by 
comparing and sharing their verbal descriptions, drawing pictures, imitating, pointing, etc. 
Two people almost always agree whether the animal in sight is an elephant or a mouse. 



Figure 11.1 Linearization of the source into the output string
(Panels: structure change, from the branched source of generators B, C, D, E, F to the
string B-D-C-E-F; physical metaphor)

A lot of legitimate questions of epistemological character could be asked here:
what is the relation between the source and its image? How can we know anything about
the source if we have only its images in the brain? Do we really have an image of
somebody’s image? And so on. I will not respond to any of them because, as a chemist with
the upbringing of an experimentalist, I have to abide by two principles: to draw the
distinction between observable objects and hypothetical or imaginary ones, and the
distinction between an established consensus and arguable approaches. This does not mean
that I consider philosophy worthless; just the opposite. I believe it is still not only waiting
to be called science but also longing for it. I can only hint that the relation between the “an
sich” (Kant) configuration and its image in PT may point toward a promising direction of
investigation because both are represented in the same mathematical language.

In Figure 11.1 linearization is shown in three aspects: 

1. Transition over the stress barrier from the initial to the final state; 

2. The structure of the initial and final states with fuzzy intermediate state;

3. A physical metaphor of the process as a mechanical squashing. 

In this way the unyielding porcupine of the source is pushed feet first through the
apparatus of speech. I like to call this process zipfing to emphasize that it must be done
with the least effort in order to compete with other modes of sound or sign
communication. The least effort is required at all stages: to understand the image of the
source, to break it up into singlets, doublets, and larger fragments, and to align the
fragments for the output. Note that these stages form a cycle because the next stage will
again be the understanding by a listener, the situation explored by Simon Kirby, James
Hurford, and others; see Kirby and Christiansen (2003). This brings us back to the ideas
of Manfred Eigen about the hypercycle and right into the identification of language as a
form of life.

The residual stress (Δ stress) is the measure of the irregularity of the output as
compared with the source.

NOTE. A nit-picking chemist, as well as linguist, may inquire about the material balance of
generators in linearization. How can we form two doublets with the same generator if there is
only one generator of each kind? Here is a hint: there is a population of generators in the mind,
similar to a population of molecules in a volume of liquid, but it is a population in time, not in
space. For example, stuck in traffic, we repeatedly return to the idea of being late to a meeting,
although any idea exists in a single copy. The time population of this idea is much larger under
the circumstances than that of the idea of the world energy crisis.

Figure 11.2 further illustrates the idea of chemilinguistry. In A, the resulting
triplets have the same degree of stress and are equally probable. The transition state can
go either way. In B, the resulting SOV (Subject-Object-Verb word order) is, arbitrarily,
more stressed and less probable than SVO because V→O in the source changes its
orientation to the irregular O→V. The least effort condition is partially satisfied if the
topology of the source is maximally preserved during linearization. I have to remind the
reader that we are dealing here with protolanguage, which has no grammar except the
preservation of topology.
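The preservation of topology can be turned into a toy score. The sketch below is an assumption-laden illustration, not a formula from this paper: the penalty weights, the orientation of the three bonds of the triangular source, and the names `stress` and `rank_orders` are all hypothetical.

```python
from itertools import permutations

# Hypothetical penalty weights, chosen only for illustration.
REVERSED = 1  # bond still expressed by adjacency, but against its orientation
BROKEN = 2    # bond no longer expressed by adjacency at all


def stress(order, bonds):
    """Sum the penalties over all oriented source bonds (a, b) for a linear order."""
    position = {generator: i for i, generator in enumerate(order)}
    total = 0
    for a, b in bonds:
        gap = position[b] - position[a]
        if gap == 1:
            continue  # adjacency and orientation are both preserved
        total += REVERSED if gap == -1 else BROKEN
    return total


def rank_orders(generators, bonds):
    """All linear orders of the generators, least stressed first."""
    orders = ["".join(p) for p in permutations(generators)]
    return sorted(orders, key=lambda order: (stress(order, bonds), order))


# Triangular source; the orientations S->V, V->O, S->O are assumed.
TRIANGLE = [("S", "V"), ("V", "O"), ("S", "O")]

for order in rank_orders("SVO", TRIANGLE):
    print(order, stress(order, TRIANGLE))
```

Under these assumed weights SVO keeps two bonds intact and breaks one, while SOV breaks one bond and reverses another, so SVO comes out less stressed. Other weights or orientations would rank the orders differently.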

Figure 11.2 Examples of linearization
A: Fork-like source; B: Triangular source

In chemistry, the structure that forms behind the lowest transition barrier
dominates the final state. This principle, if applied to language, transforms into the
following hypothetical definition of grammar in a meta-chemical but by no means
metaphorical sense: Grammar is the catalyst of language generation.

A catalyst in chemistry does exactly this: it decreases the stress (energy) of the
transition state. Grammar decreases the stress of the transition state because of the
preservation of linear fragments longer than doublets and because of introducing syntax,
i.e., the means of topology preservation other than simple adjacency.

It seems to me that what Noam Chomsky has been searching for is the formula of this
catalyst. Of course, this catalyst must be innate and, of course, it must be part of a
larger picture. And of course, we still do not know what it is.
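How much a lowered barrier matters can be seen from the usual exponential (Arrhenius-like) dependence of rate on barrier height. The sketch below borrows that form from chemistry; applying it to word order, the barrier values, and the names `relative_rate` and `selectivity` are assumptions made only for illustration.

```python
import math


def relative_rate(barrier, temperature=1.0):
    """Arrhenius-like weight exp(-barrier / temperature), in arbitrary units."""
    return math.exp(-barrier / temperature)


def selectivity(barriers, temperature=1.0):
    """Share of each pathway when pathways compete in proportion to their rates."""
    rates = {name: relative_rate(b, temperature) for name, b in barriers.items()}
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


# Hypothetical barriers: without a catalyst both outputs are equally stressed;
# the "catalyst" lowers the transition barrier of one pathway only.
uncatalyzed = selectivity({"SVO": 3.0, "SOV": 3.0})
catalyzed = selectivity({"SVO": 1.0, "SOV": 3.0})
print(uncatalyzed)
print(catalyzed)
```

The final states are untouched in this picture; only the height of the pass between them changes, and the pathway behind the lower pass takes most of the traffic.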

The short-run situation, controlled by kinetics, applies only to the spoken
language. An assiduous writer who has plenty of time to think and to rewrite the text and
is not bothered, unlike Hemingway, by physical ailment, can really put the patience of the
reader to the test with exhaustingly long sentences. An elitist writer and an elitist reader
will find each other like a sadist and a masochist.

As a self-illustration of the Zipf Principle, having little patience by nature, I prefer
the short term stress to the somewhat longer but more accurate terms irregularity and
improbability. I could also say energy, but stress and energy are opposites in pop
psychology.

In order to legitimize the use of chemical analogies beyond the general pattern
approach, we have to explain in exactly what way the chemical systems are comparable
to the language generation systems. And what is catalysis, anyway?

Procreation, cooperation, competition, and social order need acts of contact and
exchange. In molecular dynamical systems the random events of exchange are
collisions of particles. A collision is predominantly a binary event. A molecular system has
no memory of its previous state, unless it is alive.

Switchboard systems are an alternative kind of dynamical system. Modern
telecommunication makes physical contact at home or in a marketplace unnecessary.
Telephone communication is one example. The mind is another: it acts as a switchboard
system on short segments of time, and the neurons do not dash around inside the skull.

Figures 11.3 and 11.4, saving a lot of words, compare molecular and
switchboard systems.

The switchboard (SB) dynamical systems are not the same as the connectionist 
systems in Artificial Intelligence, but I cannot refer to any source other than my own 
description. As far as connectionism is concerned, this is another contentious area from 
which I would like to stay away. 

The SB system is just a mathematical image of a certain spontaneous activity 
which can be mapped onto molecular processes, while the connectionist networks are 
through-flow processing systems, usually, with feedback or under external control. The 
events in SB systems are momentary connections and disconnections between elements 
of the set of sites presented as small circles on a larger one. The physical and biochemical 
mechanism of the switching is irrelevant. The contacts involve a certain medium which 
can enhance the connection or hinder it, as well as remember it for some time. The fading 
memory of previous connections is represented by dotted lines. There are ephemeral 
connections, as well as long-lasting ones. The “movement” in the switchboard is 
spontaneous and has the properties of ordered chaos. This is completely opposite to any 
computer, even the one simulating ordered chaos. 
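With the caveat just made, that no computer program is a switchboard system, the bookkeeping of such a system can still be sketched. The code below is a rough illustration under stated assumptions: one momentary connection per step, a memory trace that fades geometrically, and a medium that favors recently connected pairs. The function name and all parameters are hypothetical.

```python
import random
from itertools import combinations


def run_switchboard(n_sites=6, steps=200, decay=0.9, bias=2.0, seed=0):
    """Simulate momentary connections between fixed sites with a fading memory.

    Returns the log of connected pairs and the final memory traces.
    """
    rng = random.Random(seed)
    pairs = list(combinations(range(n_sites), 2))
    memory = {pair: 0.0 for pair in pairs}  # fading trace of past connections
    log = []
    for _ in range(steps):
        # Assumed rule: the medium enhances pairs that were connected recently.
        weights = [1.0 + bias * memory[pair] for pair in pairs]
        active = rng.choices(pairs, weights=weights, k=1)[0]
        for pair in pairs:
            memory[pair] *= decay  # every trace fades a little at each step
        memory[active] += 1.0  # the momentary connection leaves a fresh trace
        log.append(active)
    return log, memory


log, memory = run_switchboard()
strongest = max(memory, key=memory.get)
print("most persistent connection:", strongest, round(memory[strongest], 3))
```

The sites never move; only the connections come and go, some ephemeral and some reinforced into long-lasting ones.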





With all the differences, both molecular and SB systems are, to some extent,
conservative. While the isolated molecular system maintains its energy, the SB system
requires external free energy (i.e., energy in a form capable of introducing order,
unlike the chaotic heat) to make and unmake connections. It is thermodynamically open,
but the constraints on the supply of energy limit the activity in the same way as
temperature limits the average number of collisions in molecular systems. The SB system
is thermodynamically similar to life because it stays far from equilibrium as long as the
supply of energy lasts.

I am not aware of anybody working with a computer on a limited energy supply
in such a manner that the computer itself is tweaking its software to do maximum
computations per unit of energy. Such a computer would be a true model of the mind, but
I believe that it could be created only within a population of similar computers
exchanging segments of software and capable of errors. The open source software in the
community of programmers is close to this form of evolution.

I am sadly aware of a large and menacing boulder of software capable of unpredictable errors but
protected by monopoly from selective pressure: I am using it for typing this paper.

Figure 11.3 needs little explanation. The molecular system is a series of collisions 
and the SB system is a series of connections fading with time. The dramatic difference is 
that the first one is completely chaotic while the other one is a dissipative system. The 
significance of this fact is enormous, but this is not a good opportunity to expand on it. 
Instead, the reader should turn to the numerous works, many of them popular, of Ilya 
Prigogine, for example, Prigogine and Stengers (1984). For a linguist interested in a 
larger picture, this area, however difficult, could be stimulating. It could suggest, for 
example, some against-the-tide research aiming at the connection, denied by many, 
between material culture and the character of a particular grammar. The production of 
free energy in the form of food is a necessary condition for the evolution of a complex
living system. Some languages may be better than others for this purpose and they can 
themselves evolve faster. 

The six images of Figure 11.4 illustrate the concept of catalysis. I will speak about
it mostly metaphorically, but excellent popular sites can be found on the Web. The first three
images portray the molecular systems where particles are chaotically moving. The
catalyst limits the freedom of movement of colliding particles by forming fleeting bonds
with them. It works as a biased switchboard operator who tends to connect his friends at
the expense of all the others. When the catalyst is immobilized, for example, on a solid
surface, chemists speak of heterogeneous catalysis.



Figure 11.3 Molecular and switchboard systems
(Panels: molecular dynamical system; switchboard dynamical system)

In images 5 to 6 the space is topological but non-Euclidean (the latter is also 
topological). There is no movement in this space and all generators are immobilized. It is 
the configuration space of PT. 

When a connection associates with a new generator (here comes novelty), it can 
symbolically represent the connection in any further configuration with its participation. 
Conversely, the symbol evokes its original meaning (image 5). In the case of learning 
(image 6), the frequent connection becomes permanent. 

Figure 11.5 illustrates template catalysis, which is employed by life for its most
important and intimate biological functions. This catalyst is a very large molecule,
comparable with a text, and it is active only at a small and moving area, as in reading a
text. Moreover, like any text, it is capable of copying itself, which is what life means from
the point of view of chemistry.



















































Figure 11.4 Catalysis and learning 


Figure 11.5. Template catalysis

The concept of transition state, which I touch upon only superficially in Chapter 9
and here, but more in depth in Appendix 3, is still little known outside chemistry and
politics. It has been spreading recently as a new domain of complexity. Transition state
begins to look like a paradigm of universal importance, of the same magnitude as
thermodynamics and the basic laws of nature. It may underlie a variety of
phenomena, from geological activity such as earthquakes and volcano eruptions
to “punctuated equilibrium” in biological evolution to the flight of the albatross. It may
prove enlightening in the study of the evolution of language and society.

The concept of punctuated equilibrium (Eldredge and Gould, 1972; Eldredge,
1985), sharply debated in biology, found a more hospitable soil in sociology, for example,
in the theory of organizations. The reason for that seems to be rather trivial: social
institutions, unlike biological evolution, are directly observable. What follows from the
observations in history and sociology, and even from individual human experience, is
the alternation of long (most probable) periods of stability and even stagnation with the
turbulent (less probable) periods of intense transition from one stable state to another. In
mathematics this pattern is known as a Lévy process or Lévy flight.
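The alternation of many small moves with rare long jumps is easy to imitate. The sketch below is one common way to do it, assuming uniformly random directions and Pareto-distributed step lengths with a heavy tail; the function name, the exponent, and the seed are hypothetical choices.

```python
import math
import random


def levy_flight(n_steps=1000, alpha=1.5, seed=0):
    """2D walk with random directions and heavy-tailed (Pareto) step lengths."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    lengths = []
    for _ in range(n_steps):
        length = rng.paretovariate(alpha)  # always >= 1; rare very long jumps
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
        lengths.append(length)
    return path, lengths


path, lengths = levy_flight()
ordered = sorted(lengths)
print("median step:", round(ordered[len(ordered) // 2], 2))
print("longest step:", round(ordered[-1], 2))
```

Plotting `path` would give clusters of short moves joined by occasional long flights; the gap between the median and the longest step gives a numerical hint of the same pattern.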

On a smaller scale, a similar pattern can be seen in the behavior of birds and
animals, such as the albatross (Viswanathan et al., 2002) and the jackal. An imitation of a
typical pattern is shown in Figure 11.6.

Figure 11.6. The 2D pattern of Lévy flight

The already mentioned work of Ashby (1960, 1964) on modeling homeostasis with a set of
interconnected electric devices adds another certificate of generality to the concept of
transition state. In his experiments, Ashby used a system of interconnected mechanical
units exchanging electric input and output signals with each other. Each unit could be in
one of a series of states. The system could ultimately find its state of equilibrium. When
disturbed by the experimenter, the system began a frantic and apparently chaotic search
for a new state of equilibrium. Having found one, it could stay in it until the next
perturbation. Ashby characterized this property of the system as homeostasis. He
considered it the basic property of natural and artificial dynamic information systems.
This alone was not new. The novelty was in the observation of the excited, unstable, and
short-lived transition state in a system of a switchboard type. The units, sitting on a
bench, interacted not by collisions but by exchanging information. Moreover, it was a
dissipative system.

There is a wonderful discussion of punctuated equilibrium and related topics in
Lightfoot (1999), who, by the way, mentions both Darwin and Marx as the founders of
general evolutionary theory. It is hard to dispute this juxtaposition: in fact, Marx
complements Darwin by his concept of history punctuated and driven by revolutions, but
he borrowed the idea from Hegel. Anyway, both Darwin and Marx have no peace in their
graves.

What is the truth? The truth is a consensus. This is why there is no eternal truth. 

For more about transition states in society and mind, see Tarnopolsky (2003).






12. A CHEMIST AND A CHIMP SPEAK NEAN 


Nean is a sequence of single words and, mostly, word pairs that represent the 
bond couples in the source by word adjacency. The relative significance of generators in 
the source (i.e., topicalization, also called focusing) is expressed in Nean by their 
frequency (this is very PT). An utterance in “English Nean” would look like this: 

Ug Og Ug meat 
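Both features of Nean named above, adjacency and frequency, can be read off an utterance mechanically. This is a minimal sketch; the function name `nean_profile` is hypothetical, and splitting on whitespace is an assumption about how an utterance is written down.

```python
from collections import Counter


def nean_profile(utterance):
    """Return word frequencies and the list of adjacent word pairs (doublets)."""
    words = utterance.split()
    frequency = Counter(words)
    doublets = list(zip(words, words[1:]))
    return frequency, doublets


frequency, doublets = nean_profile("Ug Og Ug meat")
print(frequency.most_common())
print(doublets)
```

In this utterance Ug occurs twice and so carries the focus, while the adjacent pairs stand for the bond couples of the source.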

Nean is, actually, a pure grammar and it can be applied to any lexicon, assuming 
that the words are just labels for generators. We can speak, write, and sign in Nean using 
words of any language, as well as arbitrary symbols, as long as we have a source and do 
not fret over the philosophical relation between the “real thing” and its symbol. 

I see protolanguage itself as a transition state toward a grammaticalized spoken
language. On an evolutionary timeline, Nean sits at the very beginning of the transition to
language. Right before the stage of developed language, Nean may take a somewhat
more sophisticated form of the “haploid Nean” (see Chapter 7).

Beginning to speak, the children recapitulate the stages of Nean by moving from 
single words to doublets, triplets, and multiplets. 

The subsequent development of grammar is beyond the scope of this paper.
Plenty of good ideas can be found in Language Evolution (2003) and the large literature, to
which a chemist can hardly add anything. The chimeric approach, however, may
illuminate the problem from a particular angle. Thus, we can hypothesize, within the
grammaticalization theory, that the inflections develop from words (see a short
perspicuous overview by Hurford, 2003), especially those that signify large classes of
objects. We can also hypothesize that the verb is a later invention (from words like hand,
foot, and mouth for verbs do, go, and eat) and this is why the SOV word order
dominates at earlier stages of language evolution. If we regard the inflected word, like (he)
SPEAK-s, as a former doublet, later collapsed, the genesis of morphemes with syntactic
function becomes a natural continuation of Nean. Uniformitarianism (the idea of deep
similarity of all languages) may have a point after all. This is exactly what I mean by
saying that all languages perform the same function (linearization of the source
configuration) by the same means of preserving binary connections during zipfing
(a.k.a. trash-compacting).

My own experience as a translator tells me that it is impossible to render a
technical or in any way special text unless the translator has reconstructed a mental image 
of the source. Moreover, such languages as Legalese and Patentese may require 
translations from English into English. The situation is more subtle, but basically the 
same with poetry. Understanding is the ability to convey the source to somebody else. 

The language cannot be outsourced. 

It is sufficient to turn to the reviews by Tomasello (2003), Hurford (2003), and
others from the same source to see that my chimeric vision of Nean is by no means
revolutionary. There is a lot of hands-on work going on in linguistics that would please a
chemist-realist, who would, however, still distrust any computer simulation in which no
thermodynamics (with an energy-like parameter) and kinetics (with a transition state)
are involved. What Nean could mean against this positive and energetic background is,
paraphrasing Michael Tomasello, an “evolutionary fairy tale with which to begin”
(instead of “conclude,” Tomasello, 2003, p. 108).

In this Chapter, I illustrate the transition from a source to the utterance with 
examples. If only the meaningful words are left, the previous sentence of this paragraph 
can be in various ways decomposed into doublets, for example: 





I illustrate 
illustrate transition 
transition source 
transition utterance 


source utterance 
illustrate example 
Chapter illustrate 

(and others) 


In a context, the shredded sentence still tells something. 

Using haplology, we can obtain some triplets and longer oligomers:


I illustrate + illustrate transition = I illustrate transition 
transition source + transition utterance = transition source utterance 
(if there is no doublet utterance source, then transition source 
utterance is more probable than transition utterance source). 

Chapter illustrate example 

etc.; one can play with this linear LEGO. 
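The linear LEGO can be played mechanically. The sketch below covers only the two moves shown above, under the assumption that a doublet is an ordered pair of words; the function names `chain` and `fork` are hypothetical, and the rule for ordering the tail of a fork follows the parenthetical remark about the doublet utterance source.

```python
# The doublets listed above, as ordered pairs.
DOUBLETS = {
    ("I", "illustrate"),
    ("illustrate", "transition"),
    ("transition", "source"),
    ("transition", "utterance"),
    ("source", "utterance"),
    ("illustrate", "example"),
    ("Chapter", "illustrate"),
}


def chain(left, right):
    """Haplology on a shared word: (a, b) + (b, c) -> (a, b, c); None if no overlap."""
    if left[-1] != right[0]:
        return None
    return left + right[1:]


def fork(first, second, doublets):
    """Merge two doublets that share their first word: (a, b) + (a, c).

    The tail order follows an existing doublet between b and c; the reversed
    order is chosen only if (c, b) exists and (b, c) does not.
    """
    if first[0] != second[0]:
        return None
    head, b, c = first[0], first[1], second[1]
    if (c, b) in doublets and (b, c) not in doublets:
        return (head, c, b)
    return (head, b, c)


print(chain(("I", "illustrate"), ("illustrate", "transition")))
print(fork(("transition", "source"), ("transition", "utterance"), DOUBLETS))
print(chain(("Chapter", "illustrate"), ("illustrate", "example")))
```

Longer oligomers follow by feeding a result of `chain` back into `chain`, since it accepts tuples of any length.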

As can be seen, the choice between “triangular” structures like transition
source utterance or transition utterance source depends on the orientation of the
bond between utterance and source. We could see that in a very general form for the
basic syntactic formula of word order in Figure 11.2 (Chapter 11). The linearized
configuration will depend on the direction of the V,O bond. This is, of course, pure
speculation, but it could be up to a linguist to confirm or reject it by observable facts, and,
quite possibly, such facts are already somewhere in the literature. There could also be a
proof that SVO corresponds to a better zipfing in a faster moving world with longer
utterances. By the way, the growing string length per source applies a harsher selective
pressure regardless of the pace of cultural progress.

Figure 12.1 further illustrates the linearization of a triplet in
the case of the basic word order. The fuzzy non-oriented transition states in the middle
column are obtained by breaking a single bond of the source. They lead to the final states
in the right column. We can also see how the factor of the relative strength of a bond
could work. But what can we say about the invisible transition states, whether in
chemistry or in linguistics? Isn’t it the same as fighting the Gothic monsters of formal
theory? Well, I can’t deny it is, to some extent, but there is an important difference, better
seen from the chemical side.


Figure 12.1 Initial (IS), transition (TS), and final
(FS) states for S, V, O order. Stronger bonds are
marked by thicker lines.


The so-called Hammond postulate in chemistry, one of a few most fundamental
ones, illustrates how chemistry handles transition states. George S. Hammond
(Hammond, 1955), being perfectly aware that his postulate was unprovable at the time (it
could be proved or refuted someday), formulated his idea thus, in language
comprehensible to a chemolinguist:

If two states, as for example, a transition state and an unstable intermediate,
occur consecutively during a reaction process and have nearly the same energy content,
their interconversion will involve only a small reorganization of the molecular structures.
(Hammond, 1955).

It means that if two structures along the transition pathway have similar stability, 
they have similar structures. One of the corollaries is that in the beginning, the transition 
state is closer to the initial and in the end to the final state. 

What Hammond did was to take the general idea of transition state formulated in
terms of energy from physics and add to it the structural aspect alien to physics.
The Hammond postulate is so general and so extra-chemical because it is stated in two
universal scientific terms that belong to all natural sciences: energy and structure. It is the
“lack of energy,” sorry for the pun, that distinguishes all formal linguistic theories.

I have no firm ground under my feet, however, when stepping on a linguist’s turf. 
It is quite possible that the formal theory is compatible with an adapted Hammond 
postulate in one form or another. 

What the Hammond postulate itself lacks, sharing this shortcoming with formal
linguistics, is the evolutionary aspect. It is of no importance whatsoever in the chemistry
of simple systems, but is crucial in the study of any complex natural system, such as
language or biochemistry. All those systems have evolutionary memory. The
generalization of the concept of transition state over the evolutionary transformations,
which I suggest, would mean at least a temporary completion of the universal theoretical
framework for complex systems, provided the non-equilibrium thermodynamics is also
included in the picture.

The order in which the generators of the source configuration historically 
appeared—the heredity and genealogy of the language—is an important part of the 
explanation of its current state. 

And in some cases this new function of the word is the first instance of this function 
being fulfilled at all, in the language concerned (Hurford, 2003, p. 52). 





In other words, the present of the complex system depends on its past. Chemistry 
is based on the opposite principle: the properties of a molecule do not depend in any way 
on its origin. 

To be more exact, the distant future of the complex system depends on its close
past.

To complement the above quotation from George Hammond, here is a quotation 
from Nim Chimpsky, an educated chimp: 


Give orange me give eat orange me eat orange give me eat orange give me you. 

(Terrace, 1979, p. 210)


This is a perfect Nean. 

Still, I will have the last word, quoting myself, see page 29: 

C-H₁, H₁-C, C-H₂, C=O, H₁-C, O=C, C-H₁, C=O !!!


Hey, my Nean is as good as Nim’s. 




13. SCENES FROM THE CAVE LIFE TOLD IN NEAN 


Hurford (2003, p. 53) notes that the study of grammaticalization requires going
backwards from the modern to the earlier and simpler stages of language. Thinking
backwards from the products of a transformation to its initial state plays an important role
in the daily work of the chemist. For example, thinking about the best way to synthesize
a naturally occurring drug, an industrial chemist develops a converging tree of routes
leading to the goal. Another chemist may imagine a different tree of pathways leading to
the natural synthesis of the drug in the plant. As the next step, both have to select a few
most probable transformation chains, which is done using different selective criteria. The
industrial chemist uses the overall cost as the criterion, trying to involve a minimum of
intermediaries and byproducts, while the biochemist looks for particular intermediaries
and byproducts. Let us take note of both completely compatible investigative approaches.
In criminal investigation, the first is echoed by cui bono, looking for the one who profits
most from the crime, while the second corresponds to collecting material evidence.
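The industrial chemist's criterion, the overall cost of a route, amounts to a least-cost path search over a graph of transformations. The sketch below uses Dijkstra's algorithm as a stand-in for that selection; the intermediates X and Y, the step costs, and the function name `cheapest_route` are hypothetical, and real route planning weighs far more than a single number per step.

```python
import heapq


def cheapest_route(graph, start, goal):
    """Dijkstra's algorithm; graph maps node -> {neighbor: non-negative cost}.

    Returns (total_cost, path) or None if the goal cannot be reached.
    """
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, step_cost in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + step_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical routes from a starting material P to a product Q.
ROUTES = {
    "P": {"X": 4, "Y": 1},
    "Y": {"X": 2, "Q": 7},
    "X": {"Q": 2},
}

print(cheapest_route(ROUTES, "P", "Q"))
```

Here the route through both intermediates is cheaper than either shortcut, a reminder that the minimum of intermediaries and the minimum of cost need not coincide.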

Figure 13.1 shows three kinds of problems arising in chemical research that are 
common for all studies of structural transformations, including human history and politics. 
Another general question can be added to them: what else could happen? 

It is impossible to reconstruct the origin of language in the absence of any data,
however fragmentary, i.e., intermediaries and byproducts, in the chemical parlance.
What may be possible is to understand the principles that guide the evolution and find or
reinterpret some evidence available at its later stages.



Figure 13.1 Three kinds of chemical investigation:

1. Can L come from K?

2. Will M generate N?

3. Which pathway from P to Q is optimal regarding condition C?

In this Chapter, I would like to recapitulate some ideas and expand on the 
chimeric principles of historical investigation of protolanguage, without claiming any 
positive results, but using positive examples. 

Since we are interested in the development of language from protolanguage, let us, 
as it is appropriate for a chemist, go back from the following full-language expression: 

In the cave Ug grimly gives a bone to Og. 

The source of the expression is an image of a situation in the (non-Platonian) cave. 
The complete situation, which may not be seen in all detail by each witness, can be more 
complex, regarding the background, participants, objects, time, weather, motives,
physical health, social tensions, etc., and a history, in which the current source is just a
limited projection (image) of the most recent episode. 

Our strategy is to hypothetically reconstruct the way from the source toward its 
transformation into the linear expression, using our knowledge of the caves, inanimate 
objects, animals, humans, and their interaction. We assume that the laws of nature were 
the same in the ancient cave as they are today. 

We can attempt the reconstruction by comparing various sources that can lead to 
the same expression. We can do that only by using the same language for both source and 
expression, which is the language of PT: our objects are configurations with a connector 
graph and numerical properties of the generators and bond couples. These properties are 
similar to the numbers in the Greenberg tables, but here we assign them to a large extent 
arbitrarily and just see what comes out of it. The following play with various sources is 
similar to chemical experiments in which we change conditions, observe the results, learn
something, and design new experiments to learn more.

To portray the source configurations, we will use a large circle with generators 
positioned on it as small circles, symbolizing the switchboard system. The bonds may 
have direction and be of negative strength, corresponding to repulsion. 

It is necessary to establish some principles of attributing direction to a bond. We 
can assume that there is a certain order of precedence between generators. For example, 
both Ug and Og have, probably, existed for about twenty years, but the bone appeared 
only today. This is why we have Og -> bone orientation. Similarly, the good old Ug 
gives the freshly cleared bone only today and not every day: Ug -> give. By the same
logic, we should direct give to Og: give -> Og. The configuration of Source 1 is 
largely arbitrary. Our goal is not the right configuration but to see how the change of the 
source influences the transition state and the output. 


We don’t know what is true. Again, the truth is a stable consensus, always temporary. For more on
this subject, see the famous Kuhn (1962), which is another illustration of the theory of transition
state.





I need one more digression. Note that this approach has to keep both the
source and its representation in the same frame. Therefore, hypothetically, knowing
something about the evolution of the source, for which we may have some hard
archeological data, we may draw conclusions about the language change. This is not a
popular point of view, but Newmeyer (2003, p. 73) tends to go against the (political
correctness) tide, and there is not a single principle of nature to support the tide.

Thus, it seems probable that the English language experienced one of the most dramatic 
changes because of the active, turbulent, and charged with energy social evolution, plus the ethnic 
mixing of England during its early history. The cultural factors (education, press) later stabilized the 
evolution. 

The long and well documented history of Sumerian language is a great source of evolutionary
facts (for a quick look, see Halloran, WWW), although not about the spoken form. Lightfoot (1999)
assembled and analyzed rich and intriguing material on language change in his both detailed and
“high-ground” book and placed it on a large picture with history as natural process in its center. I
believe that the research in the co-evolution of economy, social order, culture, and language may bring
some positive results. Language, like tea in England, beer in the USA, vodka in Russia, and mate in South
America, is a means of social bonding.

Bottero (1992), in his breathtaking book, gives a remarkable opportunity to peek 
into the mechanics of the ancient mind through the tablets with interpretations of dreams. 
We see how words become the source of thoughts, reversing the usual order. 

Let us start with the following source: 

Source 1 

This configuration reflects our belief that it is Ug who looks grim (bond 3 between
generators 1 and 6). Ug may or may not give and is, therefore, a primary generator with
regard to give: bond 1 is directed toward generator 5 and is especially strong, etc. We
direct bond 5 toward bone and just see what follows. The picture is utterly hypothetical
and intentionally complex: everything seems connected to everything. Of course, it is not:
neither bone and grim nor Og and grim interact. Unreasonably, however, the bond Ug - Og
is absent.

The generators of the configuration are labeled by six numbered words in the list: 


words =   Ug    Og    bone    cave    give    grim

          1     2     3       4       5       6


The source configuration is coded in the 6x6 connectivity matrix, the non-zero 
elements of which are the bond affinities, taking only two arbitrary values, 1 and 2. 
Instead of affinity I will use the shorter term score. The bond with score 2 is stronger (more
probable) than the bond with score 1 and it is denoted with a double line. The diagonal 
could, in principle, reflect the weights of the generators. For the sake of experimentation, 
the scores are assigned to the bonds intuitively and tentatively. 

       1   2   3   4   5   6

  1    0   0   1   1   2   1
  2    0   0   1   1   2   0
  3    1   0   0   1   2   0
  4    1   1   1   0   1   0
  5    0   2   2   1   0   0
  6    1   2   0   0   0   0


For practical reasons, we will code the same information in the form of sparse 
matrix S, which lists not the generators but the non-zero bonds. 


S =

          1    1    5    2    1
          2    2    5    2    2
          3    1    6    1    2
          4    3    5    2    2
          5    2    3    1    1
          6    4    5    1    2
          7    1    4    1    2
          8    2    4    1    2
          9    1    3    1    2
         10    3    4    1    2

Columns:  1    2    3    4    5






In rectangular matrix S, the first column is the bond number, next two columns 
are a pair of connected generators, column 4 contains the score of the bond and the last 
column indicates whether the bond is directed (1) or not (2) from the first to the second 
generator in the row. 
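
The relation between the sparse list S and the square connectivity matrix can be sketched in a few lines of Python (not the MATLAB of Appendix 4). This is only an illustration under two assumptions of mine: a directed bond (flag 1) is entered only in the cell leading from the first generator to the second, and an undirected bond (flag 2) is entered in both cells. The name dense_matrix is hypothetical.

```python
# Source 1 bonds: (bond number, generator 1, generator 2, score, flag)
# flag 1 = directed from generator 1 to generator 2, flag 2 = not directed
S = [
    (1, 1, 5, 2, 1), (2, 2, 5, 2, 2), (3, 1, 6, 1, 2), (4, 3, 5, 2, 2),
    (5, 2, 3, 1, 1), (6, 4, 5, 1, 2), (7, 1, 4, 1, 2), (8, 2, 4, 1, 2),
    (9, 1, 3, 1, 2), (10, 3, 4, 1, 2),
]

def dense_matrix(S, n):
    """Expand the bond list into an n x n matrix of scores (assumed layout)."""
    M = [[0] * n for _ in range(n)]
    for _, g1, g2, score, flag in S:
        M[g1 - 1][g2 - 1] = score      # cell "from g1 to g2"
        if flag == 2:                  # undirected: mirror the score
            M[g2 - 1][g1 - 1] = score
    return M

M = dense_matrix(S, 6)
for row in M:
    print(" ".join(str(x) for x in row))
```

Under these assumptions the directed bonds 1 and 5 make the matrix slightly asymmetric, while every undirected bond appears twice.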

The linearization is performed by a simple MATLAB program called, as the 
language, nean (Appendix 4), which can be easily modified and improved by anybody 
more experienced in programming than myself. It is probably easier to create it from 
scratch. 

The input data are matrix S, array words, integer NN and real number score. 

Example: The input of words and S for Source 1 is:

words = ['Ug   '; 'Og   '; 'bone '; 'cave '; 'give '; 'grim '];

S = [1 1 5 2 1; 2 2 5 2 2; 3 1 6 1 2; 4 3 5 2 2; 5 2 3 1 1; 6 4 5 1 2; 7 1 4 1 2; 8 2 4 1 2; 9 1 3 1 2; 10 3 4 1 2];

The program does the following: 

1. Using the function of random permutation, it generates a random linear 
sequence of generators in words; 

Example: Ug bone cave give grim Og 

2. checks each pair of neighbors against the matrix S of the source configuration. 

Example: Yes, Ug is coupled with bone, ...No, Og is not coupled with grim. 

3. sums the scores of all adjacent generator pairs that are connected in S to obtain
the overall score of the sequence;

4. repeats steps 1 to 3 NN times and compiles a list of all different
permutations of words that have the required total score.
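
The four steps can be sketched in Python as follows. This is not the nean program of Appendix 4 but a minimal re-implementation under my own assumptions: a directed bond counts only when the two generators stand in the order given in S, an undirected bond counts in either order, and the score of a sequence is the sum of the scores of its adjacent bonded pairs. The function names and the seed parameter are hypothetical; exhaustive is added only as a deterministic check of the random search. The bond list is the one used for Source 3 later in this chapter.

```python
import random
from itertools import permutations

def bond_table(S):
    """Map ordered generator pairs to bond scores.

    Each row of S is (bond number, g1, g2, score, flag);
    flag 1 = directed g1 -> g2, flag 2 = not directed.
    """
    table = {}
    for _, g1, g2, score, flag in S:
        table[(g1, g2)] = score
        if flag == 2:
            table[(g2, g1)] = score
    return table

def sequence_score(seq, table):
    """Sum of bond scores over all adjacent pairs of the sequence."""
    return sum(table.get(pair, 0) for pair in zip(seq, seq[1:]))

def nean(words, S, NN, target, seed=None):
    """Draw NN random permutations; collect those whose score equals target."""
    rng = random.Random(seed)
    table = bond_table(S)
    ids = list(range(1, len(words) + 1))
    matches, strings = 0, set()
    for _ in range(NN):
        rng.shuffle(ids)
        if abs(sequence_score(ids, table) - target) < 1e-9:
            matches += 1
            strings.add(" ".join(words[i - 1] for i in ids))
    return matches, sorted(strings)

def exhaustive(words, S, target):
    """Deterministic check: score every permutation exactly once."""
    table = bond_table(S)
    ids = range(1, len(words) + 1)
    return sorted(
        " ".join(words[i - 1] for i in p)
        for p in permutations(ids)
        if abs(sequence_score(p, table) - target) < 1e-9
    )

words = ["Ug", "Og", "bone", "cave", "give", "grim"]
# Bond list of Source 3: bond 5 is directed bone -> Og with score 2
S = [(1, 1, 5, 2, 1), (2, 2, 5, 2, 2), (3, 1, 6, 1, 2), (4, 3, 5, 2, 2),
     (5, 3, 2, 2, 1), (6, 4, 5, 1, 2), (7, 1, 4, 1, 2), (8, 2, 4, 1, 2),
     (9, 1, 3, 1, 2), (10, 3, 4, 1, 2)]
print(exhaustive(words, S, 8))   # ['grim Ug give bone Og cave']
print(nean(words, S, 5000, 8, seed=1))
```

The random search contains no rule of word order; it only draws, scores, and counts, which is the point of the program described above.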

It is not absolutely necessary, but Nean needs the number of cycles NN large 
enough to guarantee that all possible permutations are checked against the total score 
criterion for all selected sequences. Translating into Chemicalese, the number of 
molecular collisions should be large enough. 
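
How large NN must be can be estimated with elementary probability. The sketch below assumes that every cycle draws one of the n! permutations uniformly and independently, which is my assumption about the random permutation function and not a statement about the program itself. The chance that a given permutation is never drawn in NN cycles is then (1 - 1/n!)^NN. The function names are hypothetical.

```python
from math import factorial

def miss_probability(n_words, NN):
    """Probability that one particular permutation is never drawn in NN cycles."""
    total = factorial(n_words)
    return (1 - 1 / total) ** NN

def expected_missed(n_words, NN):
    """Expected number of permutations never drawn in NN cycles."""
    return factorial(n_words) * miss_probability(n_words, NN)

for n, NN in [(6, 5000), (4, 1000), (3, 5000)]:
    print(n, NN, miss_probability(n, NN), expected_missed(n, NN))
```

For six words and NN = 5000 the miss probability is about 0.001, so on average fewer than one of the 720 permutations remains unchecked.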





Important! The goal of the program is to simulate a completely random process, 
similar to molecular collisions. It does not code any intellectual activity and contains no 
algorithm other than calculating (not simulating!) the total score and counting identical 
strings. Its algorithmic part relates only to packaging the data, not generating them, 
because no algorithm exists for a random sequence. A computer can only imitate random
(actually, pseudo-random) numbers. The universal grammar for Nean cannot be 
learned: there is nothing to learn. Chaos is inherent in any large natural system 
such as the mind. What cannot be learned has no source and is new and autopoietic, i.e., 
self-emergent. The learnable grammar starts with the haploid Nean. 


Here is a typical output of program nean: 


Og give bone cave Ug grim
grim Ug cave bone give Og
bone give Og cave Ug grim
grim Ug give Og bone cave
cave Og give bone Ug grim
grim Ug give Og cave bone
grim Ug bone give Og cave
grim Ug give bone cave Og
grim Ug cave Og give bone

number of cycles NN=5000, run time: t=0 min,
score = 7, score matches 60, number of strings 9



There are no sequences with a score over 7. A lower input score leads to a
larger number of sequences. The total number of permutations is 6!=720.


Conclusion (not surprisingly): A complex and confusing source leads to a 
highly degenerated and confusing output. 


Source 2. 



The only change in Source 2 is the reversal of
bond 5: bone -> Og instead of Og -> bone. We can
regard it as a mutation in S:

S = [1 1 5 2 1; 2 2 5 2 2; 3 1 6 1 2; 4 3 5 2 2; 5 3 2 1 1; 6 4 5 1 2; 7 1 4 1 2; 8 2 4 1 2; 9 1 3 1 2; 10 3 4 1 2];















Og give bone cave Ug grim
grim Ug cave bone give Og
bone give Og cave Ug grim
grim Ug give Og cave bone
cave Og give bone Ug grim
grim Ug give bone Og cave
grim Ug bone give Og cave
grim Ug give bone cave Og
grim Ug cave Og give bone







number of cycles: NN = 5000, run time: t=0.13723 min, 
score= 7, score matches 47, number of different strings 9 

The change, as compared with Source 1, is that grim Ug give bone Og cave replaces grim Ug give Og bone cave.


The ambiguity, therefore, remains. Nevertheless, we notice that six out of nine sequences 
for both sources start with grim Ug, so that the initial position of the subject has a better 
chance to be generated at random. 

Source 3 


Intuitively we can guess that the ambiguity can be 
resolved by: 

1. Introducing take or Passive, which can be done
only through grammar or lexicon.

2. Strengthening the S -> V or S -> O bond.

We increase the strength of bond 5: 

S = [1 1 5 2 1; 2 2 5 2 2; 3 1 6 1 2; 4 3 5 2 2; 5 3 2 2 1; 6 4 5 1 2; 7 1 4 1 2; 8 2 4 1 2; 9 1 3 1 2; 10 3 4 1 2];

The effect is striking: a single sequence at score 8. 

grim Ug give bone Og cave 

number of cycles: NN = 5000, run time: t=0.16667 min, 
score= 8, score matches 4, number of different strings 1 



For comparison, at score=7 we have a terrible degeneration: 

















Ug give bone Og cave grim
grim Ug bone give Og cave
Og give bone cave Ug grim
grim Ug cave Og give bone
bone Og give cave Ug grim
grim Ug cave bone Og give
bone give Og cave Ug grim
grim Ug cave bone give Og
cave Ug give bone Og grim
grim Ug cave give bone Og
cave Og give bone Ug grim
grim Ug give Og cave bone
cave grim Ug give bone Og
grim Ug give bone cave Og
give bone Og cave Ug grim
grim Ug give cave bone Og
grim Ug bone Og give cave
grim cave Ug give bone Og


number of cycles: NN = 5000, run time: t=0.169 min, 
score = 7, score matches 129, number of different strings 18 

The resolution of ambiguity may come from a simplification of the source: only a 
few bonds decisively contribute to the score. Or: Don’t look around, focus! 

As we could expect, the complexity of language comes from the original
simplicity of protolanguage. With only a few words in the source, it becomes possible to
express some of the source configurations in an ordered-chaos way in spite of the completely
random generation.

Source 4 

Moving backwards from complexity to simplicity, 
we leave only the strongest bonds of Source 3 in 
Source 4: 

S = [1 1 4 1 1; 2 2 4 1 2; 3 4 3 1 2; 4 3 2 1 1];
words = ['Ug   '; 'Og   '; 'bone '; 'give '];

The output is singular at a very low score: 

Ug give bone Og 

number of cycles: NN = 1000, run time: t=0.0072833 min, 
score = 3, score matches 40, number of different strings 1 












Source 5 


Unless we know it for sure, there could be an 
inherent ambiguity about G2 and G3: first was 
Og, then bone or first was bone, then Og? 


Let us make bond 4 bi-directional:


Ug Og give bone 
Ug bone Og give 
Ug bone give Og 
Ug give Og bone 
Og Ug give bone 
Og give bone Ug 



bone Ug give Og 
bone Og Ug give 
bone Og give Ug 
bone give Og Ug 
give bone Og Ug 


number of cycles: NN = 5000, run time: t=0.04635 min, 
score = 2, score matches 2326, number of different strings 11 

As a result, a whole array of word orders arises. The asterisk stands for the indirect object Og.


SVO

Ug * give bone
* Ug give bone
Ug give * bone


OSV

bone Ug give *
bone * Ug give


SOV

Ug bone * give
Ug bone give *


VOS

* give bone Ug
give bone * Ug


OVS

bone * give Ug
bone give * Ug











Source 6 

Let us introduce antagonism Ug—Og by assigning a 
negative value to bond 5. 

S = [1 1 4 1 1; 2 2 4 1 2; 3 4 3 1 2; 4 3 2 1 1; 5 2 1 -1 2];

words = ['Ug   '; 'Og   '; 'bone '; 'give '];

At score =3: 



Ug give bone Og 


number of cycles: NN = 5000, run time: t=0.054 min, 

score = 3, score matches 237, number of different strings 1 


At score =2: 

Ug bone Og give 
Ug bone give Og 
Ug give Og bone 


Og give bone Ug 
bone Ug give Og 
bone Og give Ug 


number of cycles: NN = 5000, run time: t=0.056 min, 
score = 2, score matches 1259, number of different strings 6 


Now let us increase the antagonism: 

S =[1 1 4 1 1 ; 2 2 4 1 2 ; 3 4 3 1 2 ; 4 3 2 1 1 ; 5 2 1 -2 2 ]; 


Ug bone Og give 
Ug bone give Og 
Ug give Og bone 


Og give bone Ug 
bone Ug give Og 
bone Og give Ug 


number of cycles: NN = 5000, run time: t=0.044 min, 
score = 2, score matches 1277, number of different strings 6 















Next, we make GIVE predominantly a function of 
two variables: 

S =[1 1 4 1 1; 2 2 4 1 2; 3 4 3 0.5 2; 4 3 2 0.5 2]; 
At score = 2.5: 

Ug give Og bone 

number of cycles: NN = 5000, run time: t=0.045 min, 
score = 2.5, score matches 217, number of different strings 1 

At score = 2: 

Ug give bone Og 
bone Ug give Og 

number of cycles: NN = 5000, run time: t=0.045 min, 
score = 2, score matches 397, number of different strings 2 



Source 7 

Basic word order. 

words = ['Ug   '; 'break'; 'bone '];

S = [1 1 2 1 1; 2 2 3 1 2; 3 3 1 1 2];


Ug break bone 
Ug bone break 
break bone Ug 
bone Ug break 



number of cycles: NN = 5000, run time: t=0.027 min, 
score = 2, score matches 3309, number of different strings 4 


COMMENT: the information about S and O is preserved. The relationship between Ug 
and bone is unclear. What bone? 

Next we differentiate bond strength: 

bond 1, score 2; bond 2, score 1.5; bond 3, score 1 . 


















words = ['Ug   '; 'break'; 'bone '];

S =[1 1 2 2 1; 2 2 3 1.5 2; 3 3 1 1 2]; 

At score = 2.5: 

Ug bone break 
break bone Ug 

number of cycles: NN = 5000, run time: t=0.025 min, 

score = 2.5, score matches 1664, number of different strings 2 


At score = 3: 

bone Ug break 

number of cycles: NN = 5000, run time: t=0.024 min, 
score = 3, score matches 880, number of different strings 1 


At score =3.5: 

Ug break bone (now we are speaking English!) 

number of cycles: NN = 5000, run time: t=0.02 min, 

score = 3.5, score matches 819, number of different strings 1 

Note a high probability of the expression: 819 out of 5000, or 16% 
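
The observed share can be compared with what a uniform random permutation would give. The sketch assumes, as my own reading of the program, that every cycle draws each of the n! word orders with equal probability; the function name is hypothetical.

```python
from math import factorial

def expected_matches(n_words, NN, n_strings=1):
    """Expected number of cycles that hit any of n_strings distinct permutations."""
    return NN * n_strings / factorial(n_words)

print(expected_matches(3, 5000))      # one string of three words
print(expected_matches(3, 5000, 2))   # two strings share the score
print(expected_matches(3, 5000, 4))   # four strings share the score
```

With three words each order is expected in one sixth of the cycles, about 833 of 5000, which agrees with the reported 819 to 880 matches for single strings, about 1660 for pairs of strings, and 3309 for four strings.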

Here is another try at the distribution of bond scores. 

words = ['Ug   '; 'break'; 'bone '];

S = [1 1 2 2 1; 2 2 3 1 2; 3 3 1 1.5 2];

At score = 2.5: 

Ug bone break 
break bone Ug 

number of cycles: NN = 5000, run time: t=0.026 min, 

score = 2.5, score matches 1669, number of different strings 2 

At score =3: 

Ug break bone 

number of cycles: NN = 5000, run time: t=0.025 min, 
score = 3, score matches 866, number of different strings 1 




At score = 3.5: 

bone Ug break 

number of cycles: NN = 5000, run time: t=0.026 min, 
score = 3.5, score matches 831, number of different strings 1 

From the point of view of chemilinguistry, the above experiments are meaningless 
because they have no real linguistic context. My goal was limited to a demonstration of 
a possible tool, the application of which is up to nobody but a linguist. This 
chemolinguistic tool could be just a part of a much larger tool kit combining structural, 
thermodynamic, kinetic, and evolutionary approaches, and, most importantly, the realistic
sources. 

A pretty close model approximation to language as natural process is the theater 
stage where the spectators can see the action and the background, hear the actors, and 
have different opinions about what is going on. To study language as natural process 
without approximation, we have to let others watch our daily and nightly life. 





14. CONCLUDING REMARKS


For conclusions one should go back to Chapter 2, Preview. Here I would add a few notes 
about the relation between chemistry and linguistics. 

I strongly doubt—together with many linguists—that we will ever be able to 
reconstruct the genesis of language as it indeed happened. We will simply have no 
evidence, unless we find, as Baron Munchausen did (Raspe, WWW, Chapter IV), some 
frozen sounds of the primeval chat preserved in the permafrost. 

What we can do, however, is to see how it could and could not have happened in 
principle and to check the theory on practical development of vocal communication 
between humans and computers, as well as between androids and, God forbid to omit, 
gynoids. What we ultimately need is the study of language as a natural process and not 
just an insightful computer simulation or the intellectual game of “why not” and “what 
if.” This is where chemistry and linguistics find themselves in the same naturalist society. 

As recent American political history shows, we can reach remarkable heights with 
surprisingly limited language skills. Besides, eloquence does not come from the 
knowledge of linguistics. A reasonable question from an outsider is: Why do we need
linguistics at all? 

There is more to language than utility: like air, sea, and mountains, it is a beautiful 
and delightful medium for humans. It is also the starting point of their mating rituals and 
the endpoint of the relationships. As observation and study of nature, the study of
language will always kindle the interest of lay people and, probably, even of the future
androids and gynoids. 

Reading the recent overview of the problem of genesis (Language Evolution, 
2003), a natural scientist could feel some professional gratification: linguistics is 
becoming an exciting natural science. Moreover, as one can feel especially from the 
essay of James R. Hurford (Hurford, 2003), it is part of a much larger paradigm shift or,
I would say, transition state, in science as a whole. Regretfully, it is too little known how
much chemistry has contributed to the large picture of the world, apart from explaining 
the most intimate mechanism of life. 

There are countless variations in linguistic literature on the theme of Humboldt: 
“the infinite use of finite media,” see an intriguing discussion in Studdert-Kennedy and
Goldstein (2003). This is where the sister sciences tend to go separate ways. For the 
chemist, the potential infinity of atomic combinations is of no relevance. There is a strict 
division into the existing and the hypothetical, on the one hand, and the known and the 
new chemical entities, on the other hand. The latter immediately turn into known as soon 
as their descriptions are published, but there is a daily deluge of new ones. 

In some areas of linguistics, any written or spoken sentence is as good as any 
other, regardless of whether it was repeatedly used in real life situation or not, unless it is 
“ungrammatical” from the point of view of a language maven. For the chemist, the 
existence of a chemical compound must be proved by its synthesis and isolation. 
Nevertheless, not only can we hear ungrammatical sentences all around us, but our entire 
civilization is built of the daily tide of right and wrong, heresy and orthodoxy and, as 
chemistry tells us, life itself developed from errors. 

The chemical view of the world is part of the general non-Newtonian, non- 
Einsteinian, and non-quantum (in spite of the quantum theory being the deep foundation 
of chemistry) paradigm that began to penetrate, first, sciences and then to knock on the 
door of humanities after the first works of Ilya Prigogine and the first steps of Artificial 
Intelligence. I believe that the shift took hold between 1950 and 1980. The term “science 
of complexity,” as the new area is called (Kauffman, 1993, 1995), is awkward and calls 
for zipfing, but is precise. A great course of complexity by Parwani (2002) is available
online. I wish we could say omnistics. Just a look at the contents will give the reader the
true taste of omnistics: it is about everything but the string theory. 

As I would define its major attribute, omnistics is the study of objects in non- 
Euclidean spaces, namely, discrete topological spaces in which life, mind, and society 
have developed their overwhelming complexity and which we know not so much through 
instruments and gauges as through words and countable numbers. The geometry of this 
world is an open and partially renewable set of points with their neighborhoods. An 
object, including a sequence of words and an intricate idea, is represented by a sparse 
matrix. Distance in this space is quantifiable not with a tape measure but with integers 
corresponding to the minimum of elementary changes from one structure to another. The 
change of an object is a change in the matrix. It is governed either by non-equilibrium 
thermodynamics or by human intent. The open character of the matrix is incompatible 
with fundamentals of the “pre-complexity” physics and even most of mathematics, in 
whose systems nothing new can happen, although a new system can always be invented. 
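
One possible reading of this integer distance, offered only as an illustration: represent each object by its set of bonds and count the bonds that must be removed or added to turn one object into the other. Treating "add a bond" and "remove a bond" as the elementary changes is my assumption, and the names below are hypothetical.

```python
def bond_set(S):
    """Set of bonds as ordered pairs; an undirected bond (flag 2) counts in both directions."""
    bonds = set()
    for _, g1, g2, _score, flag in S:
        bonds.add((g1, g2))
        if flag == 2:
            bonds.add((g2, g1))
    return bonds

def structural_distance(S_a, S_b):
    """Number of ordered bonds present in exactly one of the two structures."""
    return len(bond_set(S_a) ^ bond_set(S_b))

# Two small bond lists; rows are (bond number, generator 1, generator 2, score, flag)
A = [(1, 1, 4, 1, 1), (2, 2, 4, 1, 2), (3, 4, 3, 1, 2), (4, 3, 2, 1, 1)]
B = [(1, 1, 4, 1, 1), (2, 2, 4, 1, 2), (3, 4, 3, 1, 2), (4, 2, 3, 1, 1)]  # last bond reversed
print(structural_distance(A, A))  # 0
print(structural_distance(A, B))  # 2: one directed bond removed, its reverse added
```

Other conventions are equally defensible, for example counting the reversal of a bond as a single elementary change.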

I believe that Pattern Theory is a welcoming portal into this entire area, where the 
chemist feels at home and so could the linguist. The growing vocabulary of the human 
race, in spite of the constant loss, is the best evidence that novelty exists. 

Moreover, I believe that PT opens the door not only into the chemistry of 
language, but also into the chemistry of thought, i.e., the evasive and murky transition 
states of the mind. Grenander (2003) offers a look over the threshold of the mind and 
enthusiasts are welcome. 

Language itself is the portal into Everything, where we can find chemistry, a 
cookbook, and a story about the origin of the portal itself... if we speak the language of
Everything. 





15. APPENDIX 


15.1 Example of Chemicalese 

The structure in Figure 15.1 belongs to C60. The spherical molecule contains only carbon
and belongs to the class of fullerenes, which gave birth to the entire area of
nanotechnology. Its root morpheme fuller was derived from the name of the famous American
architect Richard Buckminster Fuller who designed geodesic domes. The ending -ene means
the presence of double bonds. The hexagons have the skeleton of benzene.

Figure 15.1 C60, fullerene. The double bonds are shown dark.

The nomenclature name for C60 is:

Hentriacontacyclo[29.29.0.0^{2,14}.0^{3,12}.0^{4,59}.0^{5,10}.0^{6,58}.0^{7,55}.0^{8,53}.0^{9,21}.
0^{11,20}.0^{13,18}.0^{15,30}.0^{16,28}.0^{17,25}.0^{19,24}.0^{22,52}.0^{23,50}.0^{26,49}.0^{27,47}.
0^{29,45}.0^{32,44}.0^{33,60}.0^{34,57}.0^{35,43}.0^{36,56}.0^{37,41}.0^{38,54}.0^{39,51}.0^{40,48}.0^{42,46}]hexaconta-
1,3,5(10),6,8,11,13(18),14,16,19,21,23,25,27,29(45),30,32(44),33,35(43),
36,38(54),39(51),40(48),41,46,49,52,55,57,59-triacontaene [14].

Shortcuts are used also in chemical structures, as they were used in all hieroglyphic 
systems of writing, for example, n-Bu stands for CH3CH2CH2CH2 — and A stands for 
adenine in nucleic acids and their fragments. 






15.2 Examples of real-life large configurations 



Figure 15.2. World automobile trade in 1994. From Krempel (1999) 



Figure 15.3. Visitors’ traffic through Duisburg. From Krempel (WWW) 









15.3 The chemical view of the world 

The chemical view of the world is very much different from that of a physicist or a 
computer scientist, but it is only recently that chemistry began to realize its own
extra-chemical abilities (Bhushan, 2000). Even before the advent of computers, chemical
analogies inspired some landmark works in sociology and social psychology. In computer
models of modern economics, an agent looks very much like an upgraded, animated,
educated, greedy, and optimistic molecule. What a contrast with chemistry where any 
molecule dreams only about losing its energy. 

Chemists have a simple view of complexity: it is built gradually, step-by-step. A 
large natural complexity can be built only as a result of long evolutionary history. The
reasons behind this belief are of kinetic nature: a collision of more than two particles is 
very rare. Nevertheless, complex proteins, as well as minds and societies, manage to 
assemble. The concepts of chemical mechanism and stepwise concatenation of 
transformations constitute a historical dimension of chemistry. 

Since each elementary transformation is local, the simultaneous occurrence of a 
significant number of elementary transformations is improbable. In other words, the 
history of a natural complex system is Poissonian rather than Gaussian. The Gaussian 
system, synonymous with non-locality, in which any state can, theoretically, follow any 
other, always comes to an equilibrium while the Poissonian system just drags along from 
one rare event to another, between which nothing happens, and has no final state. 
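
The improbability of many simultaneous local events can be illustrated with the Poisson probabilities themselves. The rate of 0.1 events per interval used below is an arbitrary illustrative value, not a figure from any of the cited sources.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k independent rare events when lam are expected."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 0.1  # hypothetical mean number of events per interval
for k in range(4):
    print(k, poisson_pmf(k, lam))
```

At this rate about 90% of the intervals contain no event at all, and three simultaneous events are expected in roughly 0.015% of the intervals.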

As far as social evolution is concerned, even wars, which seem to be most 
common events throughout human history, follow the Poisson distribution (Richardson, 
1993), probably, because they are usually initiated by a decision of a single person of
limited imagination. From the modern physical point of view, partially influenced by
chemistry, all processes in the world are local (Mack, 2001, WWW). 

There is a definite appeal in further exploration of the chemical paradigm in the 
vast area of mind, society, and language. Chemistry possesses the recursivity and 
generativity that is considered a unique property of human language noted in all general 
reviews (Calvin, 2000). 





It is difficult to categorize chemistry as probabilistic because the large numbers of 
participating molecules ensure determinism. Yet determinism is absent from the
foundation of chemical paradigm. The chemical process, once started either by a chemist 
or accidentally, runs its course on its own through a series of random events. The 
chemical system consisting of either large or small number of molecules searches for a 
new state by random collisions between molecules, “trying on” various combinations and 
mutual orientations. While classical robots and computers follow a program created by a 
programmer, the chemical system knows only fast parallel computing based on one 
operation: drawing a random number. In human reproduction, for example, conception is 
a reaction between just two molecules and it leads to spectacular results of microcosmic 
dimension. 

Quantitatively, chemistry is mostly focused on time aspects, balance of energy, 
and irreversibility. At the same time, the meticulous, matter-of-fact representation of 
chemical events as a sequence of elementary acts, with the behavior of an individual 
molecule in the focus of attention, brings chemistry on the common descriptive grounds 
with humanities, especially, history, sociology, and even biography. In his Elective
Affinities, Goethe (1988) was, apparently, the first to bring chemical symbolism into the
chemistry of human relations.

The seemingly shapeless and amorphous appearance of chemistry, which often 
disheartens non-chemists, may obstruct the view of the mental workshop of a chemist 
who uses very sharp logical and measuring instruments and exercises a complete freedom 
of imagination in dealing with immense and incompressible complexity, as well as the 
experimental rigor, to rein it in. What streamlines the chemical thinking can be 
formulated as: everything is possible, but most of the possible is improbable and what is 
probable is local. 

Chemistry shares the principles of atomism, composition, and metrics with 
Pattern Theory. Chemical systems can be regarded as systems of symbolic dynamics 
where atomic symbols combine and recombine. Therefore, we can hope to design open, 
evolving, and autopoietic (self-originated) symbolic systems within the framework of PT 
serving as a kind of meta-chemistry. In order to do that, we must preserve a certain
degree of chemical realism in the symbolic dynamics. In addition, the transition from one
state to another must include a random component. 

Next I am going to present in a very simplified and vulgarized form some basic 
ideas of theoretical organic chemistry. 

Figure 15.4 consists of three rows (A, B, C) and three columns (I, II, III). The
upper row of Figure 15.4 (A) shows a typical example of a reaction mechanism taken
from a chemical textbook and known as SN2, which means Substitution Nucleophilic 
Bimolecular. 

A linguist is as little expected to be familiar with chemical theory as a chemist 
with transformational grammar. Nevertheless, some general properties of A can be seen 
on the surface. 


[Figure 15.4, row A: tetrahedral substrate (I), trigonal planar transition state (II), tetrahedral product (III)]

Figure 15.4 Substitution SN2



1. Except carbon C, no symbols of chemical elements can be seen. The symbols are of
abstract nature, similarly to algebra and generative grammar. Symbols Y, Nu
(nucleophile, a negatively charged particle), R, R', and R'' stand for particular
combinations of atoms. The superscript indexes 1- and δ- stand for a unit and a fraction
of negative charge, accordingly. Charged particles usually have a much higher
energy/instability than neutral ones, but vary in stability among themselves. The double
arrows symbolize the reversibility of the transformation: I ⇌ II ⇌ III. The stable states
I and III are in an equilibrium with the transition state II.

Structures in A are 3D. The black wedges in I and III indicate that the bond is 
oriented toward the viewer and the broken wedges indicate the bond behind the plane of 
the drawing. All the other bonds lie in the plane. 

2. The large square brackets around the transition state mean that it is in the process of 
change and is neither observable, nor stable. 

3. Carbon normally has four valences, sometimes, two. The carbon atom in the transition 
state II has five bonds, two of which are shown by dotted lines to emphasize that they are 
irregular and temporary. 

The middle part (B) visualizes the change of energy (stress, instability, 
irregularity) along the trajectory of the transformation. The initial and final states 
commonly (but not always) have somewhat different energy, so that the equilibrium is 
shifted toward the more stable state. 

The lower row (C) is a 2D pictorial metaphor of what is going on during the 3D 
chemical transformation. All the events occur in the plane. The gray circle approaches the 
hand holding the white circle. In transition state II, the deformed (stressed, irregular) 
five-finger hand is in a precarious position, holding both circles. In the final state III, the 
hand holds the gray circle, but it is already a different hand. 





The theory of transition state asserts that the speed of the transformation from one
stable state to another decreases with the energy of the transition state. The latter forms a
“barrier” that only molecules with sufficient energy can pass. Chemists mostly use the
term energy, but the words irregularity, stress, and deformation are also used in
discussing transition states.
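
The dependence of speed on the height of the barrier can be illustrated with the exponential factor exp(-Ea/RT) familiar from the Arrhenius equation. The Python sketch below is only an illustration under stated assumptions: the barrier values, the temperature, and the function name relative_rate are hypothetical and are not taken from the text.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def relative_rate(barrier_kj_per_mol, temperature_k=298.0):
    """Exponential factor exp(-Ea/RT) for a barrier of height Ea.

    It is roughly proportional to the share of molecules that have
    enough energy to pass the barrier at the given temperature.
    """
    return math.exp(-barrier_kj_per_mol * 1000.0 / (R * temperature_k))

# Hypothetical barriers: the higher the transition state, the slower the change.
for ea in (40.0, 60.0, 80.0):
    print(f"Ea = {ea:5.1f} kJ/mol -> relative rate {relative_rate(ea):.3e}")
```

The absolute numbers mean little here; what matters is that the factor falls steeply as the barrier grows.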

The same transformation of substitution of Nu for Y, or one circle for the other,
can run through a different transition state, Figure 15.5. It is known as SN1,
Substitution Nucleophilic Monomolecular.


[Figure: SN1 reaction scheme. The tetrahedral substrate gives the trigonal planar transition state (in square brackets) and Y; the transition state with Nu gives two tetrahedral products.]


Figure 15.5 Substitution SN1 

The initial state splits into Y and a crippled planar transition state with three
bonds at the carbon atom (A). Next, the transition state can form a bond with either Y or
Nu. Both can approach the transition state from either side of the plane. The 2D
metaphor for this apparent mess is shown in Figure 15.6.

We can start with any one of the four stable configurations outside the large
square brackets. The four-finger palm can attach to either circle, which means that all
four stable forms exist in an equilibrium.







Figure 15.6 A metaphor of substitution SN1 

Thus, the high energy transition states in SN2 and SN1 are irregular, unstable, and 
stressed because they are charged (in the eyes of a physicist) and have an abnormal 
number of bonds (in the eyes of a chemist). 

Which mechanism takes place in reality depends on the part of reality which has
been left out of the abstract picture: solvent, temperature, and the actual meaning of
symbols Y, Nu, R, R', and R''. By studying the connection between the conditions of
the transformation and its mechanism, chemistry acquired its modern theoretical
sophistication even without observing the evasive transition state.

In short, the chemist who wants to predict or explain which alternative
transformation will prevail in the short run compares alternative transition states and
gives the preference to the one with the less stressed (i.e., most probable) transition state.
Generalizing this principle to the level of configurations in natural systems offers a new,
kinetic approach to the dynamics of the complex systems built on the platform of life,
including biological evolution itself, as well as the border area between mind and society
where language resides. The kinetic principle alone would do little good if not for another
very general principle of natural complex systems, also inspired by chemistry: the change
in any natural complex system at any given time is mostly local. It means that most of
the complexity of the system is never involved in the change. This principle is well known to
the historians of revolutions and, I believe, is applicable to language.
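
The comparison of alternative transition states can be sketched numerically. The Python fragment below is a minimal illustration, assuming that each alternative is weighted by exp(-Ea/RT) and that the weights are normalized into shares; the pathway names, the barrier values, and the function pathway_shares are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant, kJ/(mol*K)

def pathway_shares(barriers_kj_per_mol, temperature_k=298.0):
    """Share of each alternative pathway, weighted by exp(-Ea/RT).

    The lowest barrier is subtracted first, which leaves the shares
    unchanged but keeps the exponentials in a safe numerical range.
    """
    lowest = min(barriers_kj_per_mol.values())
    weights = {
        name: math.exp(-(ea - lowest) / (R_KJ * temperature_k))
        for name, ea in barriers_kj_per_mol.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Two hypothetical alternatives that differ by 10 kJ/mol in barrier height.
shares = pathway_shares({"pathway 1": 80.0, "pathway 2": 90.0})
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

Under these assumptions, the less stressed transition state takes most of the flow, while the other one is not excluded, only less probable.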

It is not accidental that I selected human hands for a metaphor of substitution in
chemistry. Molecules that, like hands, have no symmetry possess chirality (handedness):
they exist as mirror reflections of each other. This property is of cardinal importance in
biochemistry. The conclusions about the invisible transition states were drawn by
chemists based on the chirality of products. In SN2, a left-handed initial state reverses its
chirality, while in SN1, the same state turns into a mixture of right and left final states.
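
The difference between the two outcomes can be shown with a toy model in Python. This is only a sketch of the bookkeeping, not a chemical simulation: the labels "L" and "R", the function names sn2 and sn1, and the equal chance of either hand in SN1 are assumptions made for illustration.

```python
import random

def sn2(hand):
    """SN2: the handedness of the initial state is always reversed."""
    return "R" if hand == "L" else "L"

def sn1(hand, rng):
    """SN1: the planar transition state forgets the initial handedness.

    The product comes out left or right with equal chance (assumed).
    """
    return rng.choice("LR")

rng = random.Random(0)  # fixed seed, so that the run is repeatable
sn2_products = [sn2("L") for _ in range(1000)]
sn1_products = [sn1("L", rng) for _ in range(1000)]

print("SN2 products:", set(sn2_products))
print("SN1 products:", set(sn1_products))
```

The observer who sees only the products can still tell the two mechanisms apart, which is how the unobservable transition state is inferred.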

A very similar method of oblique observations on utterances, leading to
conclusions about unobservable thoughts, was first applied by none other than Sigmund Freud,
a chemolinguist of a kind.


15.4 Program nean 


% PROGRAM nean
% input: NN (number of cycles), score (total score), S, words
% example: S = [1 1 5 2 1; 2 2 5 2 2; 3 1 6 1 2; 4 3 5 2 2; 5 2 3 1 1;
%               6 4 5 1 2; 7 1 4 1 2; 8 2 4 1 2; 9 1 3 1 2; 10 3 4 1 2];
% example: words = ['Ug   '; 'Og   '; 'bone '; 'cave '; 'give '; 'grim '];
% (the rows of words must be padded with spaces to equal length)

LW = size(words,1); LS = size(S,1); LWS = LW-1;
E = 0; A = [];
ns = 0;  % ns: number of selected strings
nw = 0;  % nw: number of different strings in the output
tic
for n = 1:NN
    p = randperm(LW);              % random sequence of words
    E2 = 0;                        % score of this sequence
    for i = 1:LWS
        DP = [p(i), p(i+1)];       % pair of neighboring words
        for j = 1:LS               % loop over the doublets
            DS  = [S(j,2), S(j,3)];    % pair of generators in memory
            DSR = [S(j,3), S(j,2)];    % the same pair in reverse order
            % compared with the pair of generators in the permutation;
            % the reversed pair counts only if S(j,5) == 2
            if isequal(DS,DP) || (S(j,5) == 2 && isequal(DSR,DP))
                E2 = E2 + S(j,4);      % score added to the total
            end
        end
    end
    %if E2>E, E=E2; E
    if E2 == score, A = cat(1,A,p); ns = ns+1; end
end
A = unique(A,'rows'); l = size(A,1);   % size(A,1) is 0 if nothing matched
for w = 1:l
    W = A(w,:); WW = [];
    for v = 1:LW
        WW = cat(2,WW,words(W(v),:));
    end
    disp([' ',WW]); nw = nw+1;
end
t = toc; t = t/60;
disp(' ');
disp([' number of cycles: NN = ',int2str(NN),', run time: t = ',num2str(t,2),' min']);
disp([' score = ',num2str(score,2),', score matches: ',int2str(ns), ...
      ', number of different strings: ',int2str(nw)]);











Pattern Theory 

and “Poverty of Stimulus” argument in linguistics 


Yuri Tarnopolsky 
(2004) 


Abstract 

This paper continues the examination of language as a quasi-molecular system 
from the point of view of a chemist who happens to ask, “What if the words were 
atoms?” Any new word in the vocabulary must be at some time heard or read in 
order to be acquired. The new word practically always comes linked either with 
an observed image or with a few other words in a phrase or discourse, otherwise it 
would be meaningless at the first encounter. This halo of selective connections 
makes the morphemes and words recognizable as generators of Pattern Theory 
(Ulf Grenander), i.e., as atomic objects possessing a certain structure of potential 
bonds with preferences for binary coupling. In this way the word is typically 
acquired with a fragment of grammar. Metaphorically speaking, the generators of 
language carry bits of grammar on their bonds like the bees carry pollen on their 
feet. Regarding Pattern Theory as meta-theory for atoms and words, parallels 
between linguistics and chemistry are discussed. 





CONTENTS 


Introduction

1. Sets and order

2. Laws of grammar and laws of nature

3. From sets to generators

4. Patterns

5. Notes on notation

6. Bonding

7. Acquisition of generators

8. Bond space

9. Acquisition of bond values

10. Locality

11. Some examples

12. Language and homeostasis

Conclusion

References

APPENDIX: The Chemistry of The Three Little Pigs





Introduction 


The idea that children acquire their native language in spite of the lack of either direct
instruction or a sufficient number of correct or correcting samples goes back to Plato.
Starting with this well seasoned “poverty of stimulus” premise, Noam Chomsky 
postulated the existence of an innate universal grammar (UG), and the entire theory 
became two postulates, one on the shoulders of the other. Further postulates about the 
nature of UG (for example, principles and parameters) had to be added to the increasingly 
unstable cheerleader pyramid, so that the issue became complicated and hotly debated. 
Any general course of linguistics, as well as the Web, reflects the war of the words over 
the tiny piece of intellectual land [1]. 

It seems strange that the problem of language acquisition exists at all. Language is 
a notation of thought. Then why is mastering notation separated from acquiring 
knowledge, logic, and mastering communication with the world? A possible reason is 
that we hear what children say but do not see what is going on in their minds. 
Circumventing this very large and complicated issue, I attempt to look at the bottom 
postulate of the disputed paradigm: the poverty of stimulus. This unaffiliated paper 
continues the examination of language as a quasi-molecular system from the point of 
view of a chemist who, inspired by Mark C. Baker [2], happens to ask, “what if the words 
were atoms?” The paper serves as an addendum to [3], without which some loose ends 
will hang in the air. 





Speaking and writing are manifestations of life we all engage in with visible
and audible output. Why should a chemist’s opinion matter more than anybody else’s?

It is not universally remembered that chemists long ago invented, for the
purpose of communication, the language of chemical nomenclature, which converts a
non-linear object, the molecular structure, into a string of words. The parallel between
DNA and text was captured right at the birth of molecular biology. Chemistry and
linguistics share many more conceptual genes [4].

Chemistry may be cool but linguistics is hot. I realize that there are very few
original ideas in this paper, but to review even the major domains of the enormous linguistic
literature is a hopeless task. Philosophic literature on the subject, from Plato to
Wittgenstein, would alone sink my ship still in the harbor. Many references are omitted,
especially when ideas have been popularized and are widely spread. Leading modern
linguists (some are mentioned in [3]) write about language with gripping virtuosity and
passion (examples: [9, 19]).

There are good reasons why the linguistic literature easily overwhelms the
chemist used to the enormity of the chemical literature, and, I suspect, even linguists
themselves.

First, modern linguistics is far from the consensus habitual for chemistry. Second,
linguistics is still very far from reproducing the human ability of rational communication,
contrasting with the triumph of applied chemistry, which can identify (“see”) and
reproduce (“say”) any substance from scratch. Third, many linguists examine language in
terms of language, while chemists examine molecules in terms of their graphic images
and measurable properties, using an absolute minimum of words.

While many linguists appeal to the jury for a verdict beyond reasonable doubt, the 
chemists require a hard proof beyond any doubt. Amazingly, both deal with real and 
observable objects, which alone should clear the way for linguistics into the family of 
natural sciences. Moreover, molecules are invisible without instrumentation, while words 
can be heard and seen even by small children. 

I have no intention to snicker at linguistics. On the contrary, I intend to lovingly poke fun
at the heavy, Puritan, tedious, and down-to-earth realism of my native chemical thinking.
The chemist works like the farmer and proves his point by bringing a molecule or process
to reproducible material existence, as if it were a pumpkin. This is, probably, too much of
virtue for the flowery, fluid, sybaritic, and Bohemian habits of language speakers and
tillers. But what can you expect from those who deal with mindless and speechless
molecules? Even cows moo. And yet I think chemistry is one of the two most romantic
sciences on Earth. The other one is linguistics.

Coming from a non-linguist, the opinion expressed here cannot be an argument in
a professional discussion. It might, however, illuminate the problem from a new angle for
both professionals and fans. Chemists think about the world in very distinctive terms. To
demonstrate this is my major goal and part of a larger program, see [4, 5]. I am interested
in the export of chemical experience to cognitive and social sciences with Pattern Theory
as meta-theory for all discrete complex combinatorial systems.

As for my own language, which is not my native one, I will use, with the ecumenical blessing
of George Lakoff [6, 7], metaphors with all the self-indulgence of somebody in no need
of a grant.

All terms unfamiliar to non-linguists can easily be found on the Web.


1. Sets and order 


If I asked only the question “what if the words were atoms?” and stopped there, the
answer would not go beyond a metaphor. When Mark C. Baker entitles his enjoyable
book “The Atoms of Language,” he assumes that there is more to words than their use as
building blocks of a combinatorial Lego. I am going to encourage the timid interplay of
two distant but related disciplines by asking the inverted question: “what if atoms were
words?” I believe that both questions are equally legitimate. They will guide us toward
the realm of complex discrete combinatorial systems, which is still a little explored
pre-Magellan world where administration and legislation override navigation.





Pattern Theory is the first system of mapping which shows Linguarctica and 
Chemistralia as recognizable continents made of the same firm land and surrounded by 
the same ocean. In Pattern Theory (PT), both words of language and atoms of chemistry 
are generators and their deep kinship is more than just a metaphor. The best way is to go 
to Elements of Pattern Theory by Ulf Grenander [8], which carries a great inventory of 
seeds and fertilizers for intellectual farmers. Next I will try to approach some of the basic
ideas of Pattern Theory through the back door, where ID is not asked for and nobody cares
whether you are a chemist, a linguist, or just a chatterbox.

The most fundamental and unifying concept of mathematics is the set: any collection
of any elements which can be combined and recombined into new sets. Elements of the
set have nothing but pure identities. Thus, from elements A, B, and C we can form sets
{A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}, and the empty set ∅.

The mental operation of combination is unconstrained, requires no effort, and 
the order and relation of elements in a set do not matter. There are no ties of any kind 
between the elements of a set except for being thrown together between the brackets. Set 
elements can be compared to pieces of paper with names, thrown in a bag and offered for 
drawing a lot. 
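
The unconstrained combination of elements is easy to reproduce in a few lines of Python. This is a minimal sketch: the helper name all_subsets is hypothetical, and the built-in set type stands in for the mathematical set.

```python
from itertools import combinations

def all_subsets(elements):
    """Every set that can be formed from the given elements,
    from the empty set up to the full set."""
    items = sorted(elements)
    return [
        set(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]

subsets = all_subsets({"A", "B", "C"})
for s in subsets:
    print(sorted(s))

# Order and relation of the elements play no role in a set.
print({"A", "B"} == {"B", "A"})
```

Three elements give eight sets, the seven listed in the text and the empty one, and the last line confirms that the order of elements inside the brackets carries no information.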

In casting a vote, when a number of voters place their ballots, some of them identical, we 
deal with the multiset (also called bag in computer science), for example, {A, A, B, 
C,C,C}, where order does not matter either. 
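
A multiset can be sketched in Python with collections.Counter, which keeps the number of copies of each element and ignores their order. The variable names below are illustrative only.

```python
from collections import Counter

# The ballots from the example {A, A, B, C, C, C}, cast in two different orders.
ballots = Counter(["A", "A", "B", "C", "C", "C"])
same_ballots = Counter(["C", "A", "C", "B", "A", "C"])

print(ballots)                  # counts per element
print(ballots == same_ballots)  # the order of casting does not matter
```

Unlike a set, the multiset remembers that C was cast three times, which is all that the counting of votes needs.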

Strictly speaking, chemistry deals with multisets, so that when molecule A turns into B, 
there is still plenty of A around. This is not so in computers and the mind, where 
everything is represented in single copy, so that any destruction is irreversible. The 
analogy with chemistry, however, becomes striking if we note the function of memory: 
whatever happens, memory keeps hard copies, at least for a while, and so creates an 
effect of multiplicity. 

A list of names in alphabetical order is a quite different object. The elements of
the list cannot line up freely. They must stick together in a certain way defined by their
local properties, namely, their first and subsequent letters compared with an arbitrary
global alphabet: A, B, C ... Z. The alphabet is just a mapping (one symbol to one
number) of the set of positive integers on the set of symbols. Such sets are ordered: for
each two elements, one precedes the other.
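
The alphabet as a mapping between symbols and positive integers can be written out directly. The Python sketch below is an illustration with hypothetical names; the list of names is invented for the example.

```python
import string

# The global alphabet: each symbol is paired with one positive integer.
alphabet = {letter: rank
            for rank, letter in enumerate(string.ascii_uppercase, start=1)}

def order_key(name):
    """Translate a name into the numbers of its first and subsequent letters."""
    return [alphabet[ch] for ch in name.upper()]

names = ["CARL", "ANNA", "BOB", "ANN"]
ordered = sorted(names, key=order_key)
print(ordered)
```

The order of the list follows entirely from the local properties of each name and the global mapping. Any other assignment of numbers to letters would define a different but equally valid order.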

There are also partially ordered sets (posets), like the somewhat flexible list of our daily 
priorities or the hesitant subjective rating of beauty contestants. In such sets we are 
certain about the order of some but not all pairs of elements. 


2. Laws of grammar and laws of nature 


Suppose we collect a large number (corpus) of actually spoken and not just possible 
utterances and have to decide which are correct (grammatical, lawful) and which are not. 
This task is easy if we have clear criteria of correctness, or at least a corpus of definitively
correct utterances. But what if the language purists are split on the subject and, moreover,
the speakers do not listen to them? Then a possible solution is to select a sub-corpus of
the statistically most frequent variations with the same meaning and regard them as the
standard against which irregularities could be measured. The purists may not agree with
some entries, but they will certainly be satisfied with most of them. Some people,
however, may disagree about meaning. Besides, spoken language is largely automatic, 
improvised, and heavily dependent on context, intonation, and facial expression. 

My point is that the notion of correctness is like the survival of the fittest in 
biology: it is circular and, therefore, just a mental toy. 

Note that real man-made lists can have irregularities, and the number of deviations
from the alphabetical order per unit of length could be an overall measure of the
irregularity. For large collections of linear sequences (strings) of elements, whether
speech, texts, or DNA, a metric such as the Hamming distance can be established: the
sequences differing in one element are closer than the sequences with two discrepancies.
This can be generalized to any complex objects. This is how a natural statistical norm
can be captured. Statistics is not the only instrument of the linguist comparable with the
heavy-duty instrumentation of the chemist. There are probes and tests very similar to
chemical ones, like the wug-test.
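Both measures mentioned in this paragraph are easy to sketch in Python. The Hamming distance is standard; the `irregularity` function is only one possible reading of "deviations per unit of length" (here: the share of adjacent pairs that break alphabetical order), and its name and definition are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def irregularity(names):
    """Share of adjacent pairs that violate alphabetical order.

    Hypothetical measure: 0.0 for a perfectly alphabetical list,
    1.0 for a list in which every neighbor pair is out of order.
    """
    pairs = list(zip(names, names[1:]))
    if not pairs:
        return 0.0
    breaks = sum(left.lower() > right.lower() for left, right in pairs)
    return breaks / len(pairs)


d_dna = hamming("GATTACA", "GACTATA")
clean = irregularity(["Andy", "Bob", "Pat", "Ronny", "Zelda"])
messy = irregularity(["Andy", "Pat", "Bob", "Zelda", "Ronny"])
```

Note that `irregularity` inspects only neighbors, so it stays a local measure: no global view of the list is needed to compute it.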

My intent in probing the laws of grammar is to compare them with the laws of 
physics and chemistry. There is no correctness or incorrectness in nature, to which 
language belongs. This is why the very idea of a natural grammar as a system of rules and 
parameters, or just rewrite rules, stored in the mind of a little child and not in a book or 
an adult mind, seems to me, a chemist, as unnatural as any alphabet. 

Let us turn from artificial (mind-driven) to natural (mindless) processes without 
external human control, which we understand much better than human matters. 

The atoms stick together for physical reasons and form more or less stable
aggregates called molecules where the number and order of connections of the atoms of
different kinds have a decisive bearing on the individuality and behavior of the molecule.
If the atoms were indeed words, we could say that the atoms could form some stable
aggregates and resist forming some others because they knew the grammar (i.e., rules
and parameters) of chemistry.

Atoms do not consult the textbook of chemistry, however, before assembling into
aspirin. The chemical story, possibly enlightening for a linguist, is that they “know” the
rules in a noteworthy manner: given indefinite time, the molecules assemble in such a
way that the aggregates with the lowest energy are much more abundant than those a
notch up on the scale. This is the only universal natural rule, but there is a multitude of
properties of atoms (properties, not rules) that define the actual form of the aggregates.
There is a widespread belief among linguists that language has an infinite generating power
(Chomsky: “discrete infinity”), but chemists are more cool-headed. Theoretically, all
possible assemblies of a given set of atoms will be present after an indefinite time, but
only a few will in fact be detectable, and even fewer will be prevalent. Should we say that
the minor versions are wrong?
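The rule that low-energy aggregates dominate at equilibrium is usually written as the Boltzmann distribution, in which the share of a state is proportional to exp(-E/kT). The sketch below uses that textbook formula; the three state names and their energies are invented for illustration and do not come from the text.

```python
import math


def boltzmann_shares(energies, kT=1.0):
    """Equilibrium shares of states, each proportional to exp(-E / kT)."""
    e_min = min(energies.values())  # shift energies for numerical stability
    weights = {name: math.exp(-(e - e_min) / kT) for name, e in energies.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical energies in units of kT: each notch up costs three units.
energies = {"ground": 0.0, "one notch up": 3.0, "two notches up": 6.0}
shares = boltzmann_shares(energies)
```

Every state keeps a nonzero share, so the minor versions are present rather than "wrong"; they are simply rare, and the higher the energy, the rarer.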


Chemists, like businessmen, are not interested in indefinite time, and they focus on the
problem of the relative speed of alternative concurrent transformations. The concept of
transition state, i.e., the bottleneck of transformation, is the major, but still neglected,
contribution of chemistry to the theory of complex systems. For more about that, see [5, 4, 3].
This, however, has little to do with language acquisition because the time of individual
learning is negligible compared with the time of language genesis and evolution.

Molecules behave because of the inherent molecular motion. Any living language, one
not pinned in a display box, is also full of motion: utterances fly around, clash,
and scatter the sparks of fragments, not to mention processes on the historical scale.

There are especially hot areas of professional, sub-cultural, and child language where 
new mutants are born to survive or die. Some parts of everything that has been really 
said and written during the day settle down in written or recorded form and can be used 
to study the language variations and evolution in the same way a paleontologist studies 
the evolution of birds. Alas, not much was left before writing, but tribal languages store a 
lot. 


Language, which comes in populations and perpetuates by replication, mutation, and
exchange of material, is a form of generalized life. Contrived examples are chimeras.
Remarkably, chemists sometimes start with chimeras in their imagination and then
successfully synthesize them and even use them for practical purposes, as nanotubes illustrate,
but one cannot learn chemistry from nanotubes alone.

Molecular systems and language are similar not only because they consist of
atoms, but because they are natural dynamical systems driven by the constraints of
thermodynamics. This term, reeking of hot engine oil, means, in fact, something very
general and simple: there is a preferred direction of natural events, and we know what it
is, and if we go against it, we have to pay a price in the currency of energy. Natural
language is not an exception and this is why it is always correct, until some shock hits
the society of speakers and the language finds itself in an uncomfortable unstable
position on the hill slope and slides toward a new position of reduced social stress, as the
preferred direction of natural events requires. The preferred direction of language
evolution, recapitulated in individual language acquisition, is the optimization of
communication as part of social life. This is how a chemist could paraphrase the perfectly
natural idea that language is an adaptation (Steven Pinker in [9] and earlier with Bloom),
although evolutionary thermodynamics of open systems, like life and society, is today
formulated only in very general terms. An eloquent, I would say beautiful, discussion of
the subject in non-chemical terms can be found in [9], where even the distant
behind-the-scenes voice of chemistry can be heard (Komarova and Nowak, in [9]).

My opinion as an outsider is that truly systemic linguistics should treat language as a
natural system with a generalized physics, chemistry, and even physiology. A similar
direction was outlined by Christian Matthiessen and M. A. K. Halliday: “It [language] is
a phenomenon that can be studied, just like light, physical motion, the human body, and
decision-making processes in bureaucracies; and just as in the case of these and other
phenomena under study, we need theory in order to interpret it” [10]. In [3] I put Joseph
Greenberg and Noam Chomsky in the opposite corners of the linguistics Hall of Fame.
Today I see, alongside Greenberg, Brian MacWhinney [11], who works with language as a
typical natural scientist.


3. From sets to generators 


There are more probable (and, therefore, more stable —thermodynamics again) and less 
probable (and, therefore, less stable) molecules, which are sets of atoms under various 
constraints. Some atoms in a molecule are bonded, others are not, and the bonds have 
different properties. 

The mathematical image of molecular structure is a graph in which atoms are 
points selectively connected with lines. In a simple graph, the positions of the points 
(nodes) do not matter and all the connections (arcs) are the same. In the colored graphs 
the arcs, and in the labeled graphs the nodes, can be different, which makes them suitable 
to portray molecular structure. In a real molecule, atoms commonly oscillate around fixed 
positions in 3D space. 





Only a few molecules are linear, but linear polymers, for example, DNA, 
consisting of molecular blocks arranged like the words in a text, are the most conspicuous 
evidence of the mathematical kinship between language and chemistry, well recognized 
by both clans. 

Figure 1 shows a few objects that can be obtained from simple sets by adding 
binary relations between elements. Some elements are circles with different fill patterns, 
to emphasize their individuality. Sets with connected elements represent what chemists 
and architects understand by structure. 


[Figure: panels numbered 1 to 4, labeled Set, Graph, Colored labeled graph, and String; additional labels: Alphabet, Alphabetical list.]

Figure 1. Evolution of sets

The concept of mathematical structure is something different. In a way, an ideal 
grammar is a mathematical structure: it separates right from wrong. Mathematical 
structure, simplistically, consists of terms, axioms, and operations, so that one can say 
which result of operations is right and which is wrong. Algebraic structures are a set of 
elements, a list of axioms, and one or more operations that turn two elements into a third. 
Obviously, this is what chemistry and linguistics do by connecting one block with another.
This is why algebraic structures lie at the foundation of mathematical linguistics [12].
The belief in the infinite language, probably, feeds on the computational idea of language 
as an obviously infinite set of strings generated by the operation of concatenation. 






Generators and configurations also form an algebraic structure. From this point of view, 
algebraic structure and “structural” structure conceptually converge: it is all about binary 
operations. Chemical structures, however, have a lot of additional physical constraints 
that are beyond algebra. 

Let us take an alphabetical arrangement: LIST = Andy—Bob—Pat—Ronny— 
Zelda. It is ordered according to the Latin alphabet: 

A 1, B 2, C 3, ... O 15, P 16, Q 17, R 18, S 19, ... Z 26

LIST is a very simple artificial object, a result of my mental activity. If the words 
were atoms, it would be reasonable to ask how LIST could originate from names without 
human participation. 

We can describe the petite LIST in deceptively complex terms as a
configuration of elemental objects: generators g from the generator space G.
Generators are a kind of abstract atoms that can self-assemble, at least in our mind, into
regular (lawful) configurations. Two examples are shown in Figure 2.



Figure 2. Generators of LIST 

Each of the objects has an identifier g, i.e., a label (name), and two bonds with
coordinates j, labeled as L(eft) and R(ight). The bonds have numerical bond values $\beta$,
which are simply the positions of the first letter of the name in the alphabet. It would be
reasonable, however inconvenient, to index $\beta$ and j with the generator name:





Let us apply the following rules to the behavior of these objects: 


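For $g_1$ and $g_2$, $\{g_1, g_2\} \subset G$ (i.e., for two generators from space G):

$$\rho = \mathrm{TRUE} \quad \text{if} \quad \beta_L^{(2)} > \beta_R^{(1)}$$

Here we have the bond relation $\rho$ that depends on the bond values $\beta$ of two
generators g: the left bond value of $g_2$ must exceed the right bond value of $g_1$.
If $\rho$ = TRUE, the two generators can be neighbors in the list. If FALSE,
they are not fit to rub shoulders.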
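The rule can be sketched in a few lines of Python. The function names `beta`, `rho`, and `is_regular` are chosen here for illustration; the sketch assumes, as the text does, that both bonds of a name carry the alphabet position of its first letter.

```python
def beta(name):
    """Bond value: 1-based position of the name's first letter in the Latin alphabet."""
    return ord(name[0].upper()) - ord("A") + 1


def rho(g1, g2):
    """Bond relation: TRUE if g2 may stand immediately to the right of g1."""
    return beta(g2) > beta(g1)


def is_regular(config):
    """A linear configuration is regular if every pair of neighbors satisfies rho.

    Only adjacent pairs are inspected, so the check is strictly local.
    """
    return all(rho(left, right) for left, right in zip(config, config[1:]))


LIST = ["Andy", "Bob", "Pat", "Ronny", "Zelda"]
```

With these definitions `is_regular(LIST)` holds, while swapping any two neighbors breaks exactly the bonds around the swap and nothing else.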

Note the local character of the rule. In order to check a compliance with the rule, 
only an examination within a tiny area of the string is necessary. The entire “behavior” of 
generators is local: the events consisting of acts of locking and unlocking (no, words local 
and lock have different origin) do not happen at a distance. Otherwise it would require a 
homunculus to control it. This property is important for any process of genesis of a 
complex system without a complex controlling mind. Natura non facit saltum, from this 
viewpoint, means that nature has neither mind nor algorithm, nor Random Access 
Memory to romp around all over. We can safely assume that humans started the language 
from scratch and were not taught by other beings how to deal, for example, with 
anaphora, i.e., the mental jump from the noun to its respective pronoun and back while 
we speak. It is only natural that humans are teaching the computers with RAM and ROM 
to do this trick and not to be lost in multiple nouns and pronouns. The simpleminded 
children acquire their simple pre-school language without a tutor because the acts of 
acquisition are local and, therefore, do not require a thinking mind. Later they start to 
chew on the bitter roots out of which the minds grow. 

The word behavior adds a flavor of spontaneous activity to generators.
Metaphorically speaking, we take a handful of generators g from the bag G and shake
them in a box, so that they could stick to (or repel) each other according to their
preferences, like the frequenters of a singles bar, only somewhat kinkier. Note that
neither algorithm nor human control is needed for the spontaneous self-assembly of the
above generators into a string: nothing but some random “shaking.” We have to supply
motion as the source of chaos, but a series of earthquakes over a million years would be
as good for the purpose of shaking a box.
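The shaking can be imitated with a toy simulation, given here as a sketch under stated assumptions rather than as anything found in Pattern Theory. Fragments collide at random and lock only if the local bond rule allows it; with probability `p_unlock` a random bond breaks instead. The names `shake` and `p_unlock`, the unlocking probability, and the stopping condition (all names in one string) are assumptions made for illustration.

```python
import random


def beta(name):
    """Bond value: alphabet position of the first letter."""
    return ord(name[0].upper()) - ord("A") + 1


def shake(names, p_unlock=0.3, seed=0, max_steps=100_000):
    """Randomly lock and unlock bonds until all names sit in one string."""
    rng = random.Random(seed)
    fragments = [[n] for n in names]  # every generator starts free
    for step in range(max_steps):
        if len(fragments) == 1:
            return fragments[0], step
        if rng.random() < p_unlock:
            # Unlocking: break one randomly chosen bond, if any exists.
            bonded = [f for f in fragments if len(f) > 1]
            if bonded:
                frag = rng.choice(bonded)
                cut = rng.randrange(1, len(frag))
                fragments.remove(frag)
                fragments += [frag[:cut], frag[cut:]]
        else:
            # Collision: two random fragments meet end to end.
            left, right = rng.sample(fragments, 2)
            if beta(right[0]) > beta(left[-1]):  # local bond rule only
                fragments.remove(left)
                fragments.remove(right)
                fragments.append(left + right)
    return None, max_steps  # gave up; not expected for short lists


chain, steps = shake(["Zelda", "Pat", "Andy", "Ronny", "Bob"])
```

No step of the loop looks at more than two fragment ends, yet the only string that can hold all five names is the alphabetical one. Unlocking matters: without it, an early bond such as Andy to Zelda would shut Bob out forever.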

The physical flavor that I attribute to generators is not to be found in Pattern Theory,
although it can be easily inferred. I am taking some chemical liberties with mathematics. 
Nevertheless, in some sense, defined in PT, generators selectively accept (or repel, I 
would add) each other with varying enthusiasm. There is an acceptor function for the 
mutual affinity of two generators. 

In a more general case, the bond value can be any number, regardless of any
alphabet, and $\rho$ could be a function of the two contacting $\beta$, so that for some pairs (bond
couples) $\rho$ is high (i.e., very true) and for others low or negligible (i.e., very false). Not
only that, but probabilities or additive weights (not related to connectionist learning) can be
attributed to the generators themselves, as in topicalization, i.e., putting the most
important word first (fronting), and marking it either vocally or, in languages like
Japanese, with a morpheme, in addition to fronting.

Finally, we would be able to calculate the probabilities of a string of symbols or a
list of names by multiplying the probabilities of bond couples and the “weight” of names.
We can also express the mutual affinity of generators and their weight in terms of
generalized energy, which, unlike probability, is additive over the increments of bond
couples. As a result, we arrive at a kind of generalized chemistry of symbols, if not of
everything. In PT it is called Pattern Synthesis.
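The two equivalent bookkeeping schemes, multiplicative probability and additive energy, can be compared in a short Python sketch. The weights and the acceptor values (0.9 for an alphabetical pair, 0.1 otherwise) are invented numbers, the result is an unnormalized probability, and the energy is taken as the negative logarithm of each factor, which is an assumption of this sketch rather than a definition from the text.

```python
import math


def probability(config, weight, acceptor):
    """Unnormalized probability: product of generator weights and bond-couple values."""
    p = 1.0
    for g in config:
        p *= weight[g]
    for left, right in zip(config, config[1:]):
        p *= acceptor(left, right)
    return p


def energy(config, weight, acceptor):
    """Generalized energy: the same factors, but added up as negative logarithms."""
    e = sum(-math.log(weight[g]) for g in config)
    e += sum(-math.log(acceptor(l, r)) for l, r in zip(config, config[1:]))
    return e


# Hypothetical numbers for illustration only.
weight = {"Andy": 1.0, "Bob": 1.0, "Pat": 1.0}


def acceptor(left, right):
    """High affinity for an alphabetical couple, low but nonzero otherwise."""
    return 0.9 if left[0] < right[0] else 0.1


p_good = probability(["Andy", "Bob", "Pat"], weight, acceptor)
p_bad = probability(["Pat", "Andy", "Bob"], weight, acceptor)
e_good = energy(["Andy", "Bob", "Pat"], weight, acceptor)
```

Because the acceptor never returns zero, the irregular list is not forbidden, only improbable, which is the relaxed regularity the passage has in mind.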

Pattern Theory is about probabilities on structures. The system of this kind 
generates thoughts in Ulf Grenander’s GOLEM [13], which, in my view, is an initial 
model of meta-chemistry where generators combine into configurations under 
predominantly local rules. 

To make the next step toward PT, we need to further generalize the generator by
lifting the limit on the number of bonds and allowing non-linear configurations. They are
never present in speech, but nobody has ever managed to do syntax or semantics without
either a tree or a Russian doll of brackets.



Figure 3. “Generic” generator 

A generic portrait of an atom of everything is presented in Figure 3. It can have
more than two bonds and can form a great variety of configurations, typically non-linear.
Abundant illustrations from a large number of areas of knowledge can be found in [8] and
in more special and technical works in PT, where numerous shapes are analyzed and
produced from a generator space and minimal or no global constraints.


4. Patterns 


The last key step from configurations to patterns is simple: a pattern is a class of
configurations. In PT it is a similarity transformation that generates one configuration
from another within the same class. Regularity of a configuration, which plays the role
of mathematical structure axioms, consists of generator space G, bond relations $\rho$,
similarity transformation S, and the type of connector $\Sigma$, for example, LINEAR or
TREE. Regularity can be strict or relaxed.

Since major applications of Pattern Theory are designed for processing two-dimensional
images, similarity transformations can often be expressed analytically in the
form of equations, for example, for stretching, rotation, warping, etc., or by non-trivial
algorithms. In the discrete space of linguistics this is hardly possible. A similarity 
transformation can be simply a change of the word within a logical, morphological, 
syntactical, or, as in poetry, just phonetic category. The prototype of all similarity 
transformations in linguistics is the Laputian machine, witnessed by Gulliver, in which 
letters were permuted at random to produce a word of new knowledge. 

Another way to describe the pattern is to formulate not what the similarity 
transformation changes but what it leaves unchanged. 

Because of the historical origin of language patterns, it is not always possible to
explicitly formulate the similarity transformation, and it can be defined as just a list.
Languages with genders (Russian, German) and noun categories (Bantu group) are of this
type. What seems a logical aberration, like the neuter gender of girl in German (das
Mädchen), must be just memorized by rote. There must be some evolutionary logic.

An interesting channel opens between PT and the domain of categorization in 
linguistics, with theory of prototypes (Eleanor Rosch [14]), inspired by Ludwig 
Wittgenstein’s concept of family resemblances. The template of PT, which is a typical 
configuration, seems to be the exact counterpart of the prototype in linguistics. Pattern 
Theory was also inspired by Wittgenstein. The realistic aspirations of systemic and cognitive 
linguistics echo the realistic spirit of PT, still missing, as it seems, the heart of the matter 
with all natural systems: measure. 

Since I am interested here only in what is necessary to test the Poverty of
Stimulus Argument, I shall refer the reader to PT for more detail. I have only a few
strictly personal remarks to make as a chemist and non-mathematician.





A generator space G, i.e., a set of generators with their properties, defines a
combinatorial configuration space $C_G$. Formally, it does the same job as the Cartesian
coordinate system, which defines all possible points of Euclidean space, or the 
Chomskian principles and parameters, which, ideally, define all possible grammars, or 
the questions of the US Census, which define the essential profile of the US population. It 
does the job, however, without any explicit global list of coordinates for the 
multidimensional spaces, but by providing strictly local descriptions of generators and 
allowing some freedom of their mating. 

We can use this curious system of coordinates, so practical for immaterial discrete 
systems in non-Euclidean spaces, if we tacitly count on the mystic ability of generators to 
find each other and stick together the right way or one among many right ways, some of 
them more right than others. This is a kind of mathematics that has some properties of 
physics and chemistry. This is how I became enthralled by PT. It is potentially a calculus of all 
realistic (however immaterial) discrete combinatorial objects because it attributes a 
priori some probability to their existence. It predicts what is likely to happen and 
explains why other alternatives are less likely. 

In the case of alphabetic lists, the all-or-nothing system of rules defines all
possible alphabetic lists and separates them from the even larger space of all non-alphabetic
ones.

In the general case, Pattern Theory can partition the set of all configurations into
regular (by the rules) and irregular ones, but even more generally, it offers a measure of
regularity on a continuous scale.

The connection with linguistics can be seen here. Not only are grammatical 
structures regular and ungrammatical ones irregular, but some are more regular than 
others and others are less irregular. Moreover, there is a measure of stability, and some 
constructs are more stable—less ambiguous or difficult to understand—than others, 
provided the content of the mind is the same. Benjamin Whorf's idea, as I understand (or
misunderstand) it, was a relation between language and the content of the mind, which is 
defined at least partly by the environment. 


Here are two color-coded examples from [15].





The lines mean: 

1. A Hungarian phrase. 

2. The Hungarian phrase segmented into morphemes. 

3. My interpretation, in a kind of pidgin, of the meaningful words and morphemes. 
The colors match the first line. 

4. Linguistic glosses, i.e. meanings of words and morphemes, using abbreviations. 

5. English translation 

6. Literal translation 


Example 1 


1. A szobában ülő gyerekek játszanak

2. a szoba-ban ül-ő gyerek-ek játsza-nak

3. the ROOM-in_it SIT-doing CHILD-many_of_them PLAY-they_do

4. the room-INESS sit-PRSPART child-PL play-3PL

5. 'The children sitting in the room are playing.'

6. Literally: The in-room-sitting children play.


Example 2 

1. A gyerekek a szobában ülve játszanak

2. a gyerek-ek a szoba-ban ül-ve játsza-nak

3. the CHILD-many_of_them the ROOM-in_it SIT-while_doing PLAY-they_do

4. the child-PL the room-INESS sit-VERBADV play-3PL

5. 'The children are playing sitting in the room.'

6. Literally: The children, in the room sitting, play.





















I bet most readers can understand almost any Hungarian phrase presented like the 
third line. Moreover, any language can be understood in the same way. This property of 
language is the basis of automatic translation. I added line 6 in order to be as close to
the Hungarian text as possible, even at the cost of “correctness.” Yet the correspondence 
between the phrases is somewhat loose. There is only one definite article a in the first 
example, but two the in the English translation. The Hungarian text does not give a clue 
regarding the English tense, because there are only Past and non-Past tense markers in 
Hungarian verbs. The Hungarian continuous tense is extinct. Instead, Hungarian has a number of verb 
forms untranslatable into English directly. 

Both examples, honestly, mean the same and can be expressed in both languages 
also in other ways, e.g., “The children are sitting in the room and playing”. 

We understand the sentence in Hungarian or Japanese, if translated into an 
artificial inter-language, because it describes a situation with which both Americans and 
Hungarians are familiar. A phrase from a textbook of microbiology would not be 
universally understood. Our language is embodied (George Lakoff) in our human 
existence. 

I have not discovered either America or Hungary with my examples. This subject 
has been intensely discussed in linguistics. I would formulate it this way: the patterns of
thinking are universal because their invisible generators are identical, but the patterns of 
language are different because their visible generators are different. We master both 
because we acquire the generators bit by bit and this is all we need. Our mind is a flask 
where they assemble and our mouth pours them out. The ready patterns of adult and peer 
speech catalyze some patterns of children at the expense of others. But we pick the 
generators from different fields and they could be different berries altogether. 

As I suggested in [2], not claiming any revolutionary insight either, because this is
the very essence of language, the configuration of the internal source of the utterance is
commonly non-linear and it must be linearized, sometimes in a tortuous way, to be
verbalized. This is where I see the essence of universal grammar, and like any essence it is
utterly simple.





In terms of Pattern Theory, grammar is a collection of regular (not correct!) patterns 
of word/morpheme configurations. From this angle, UG is, most probably, just the innate 
ability of humans and animals to perform pattern analysis and synthesis, demonstrated
not only in language but also in locomotion, perception, hunt, work, dance, rituals, play,
investigation, politics, etc. The uniqueness of the human language acquisition device,
however, is undeniable. The speaking mind has to convert the nonlinear content into a 
linear message at one end of communication channel and reconstruct the content at the 
other end—a far cry from learning to dance or bake pizza by just watching how it is done 
and repeating the motions in the Euclidean space. 

It is my personal impression that PT plays the role of mathematical physics of 
complex combinatorial systems to which all chemistry and manifestations of life on earth 
from the life of a cell to society belong. In this area not only are deterministic equations
usually powerless, but even the probabilistic theories get stuck in the mud for a simple
reason: in evolution and history every global (defining) event is unique. It does not 
belong to a statistical ensemble, while local events do. It can be comfortably viewed, 
however, in local terms of breaking and interlocking bonds, which is the area of expertise 
of chemistry. 

So much for the kinship of chemistry and linguistics. What about linguistics and 
biology? Speaking not as a chemist but as an adventurer, I would say that the language in
the form of the second or third lines of the above examples is, in my opinion, the closest
approximation we can have to the most ancient pattern of tribal languages, well after
Nean. As we have lost our tail and are now losing classical music, so English lost its cases,
Hungarian lost its Present Continuous, and Russian its vocative case.

What they all acquired was a great syntactic complexity of compound sentences to
describe complex ideas, situations, or just to show off, as is proper for a performance art
such as circus.


With similar experience, we will understand each other whether we say

ROOM-in_it SIT-doing CHILD-many_of_them PLAY-they_do, or

CHILD-many_of_them PLAY-they_do SIT-doing ROOM-in_it, or

many_of_them-CHILD they_do-PLAY SIT-doing in_it-ROOM-in.


Moreover, unless our life experience is radically different, we will understand 
each other even in Nean [2]: 


child! child play! child sit! sit room! play sit! 
child! child room! play child! play room! 


I have an impression that we encounter difficulties with automatic translation not because
the problem itself is complicated, but because our civilization is a real mess. What do you
think the words universe, magma, ring, loop, variety, envelope, and signature mean?
They all are mathematical terms. Atom and molecule are terms of propositional logic. I
suspect that tribal languages in their traditional pre-technological forms are the easiest to
cross-translate if the subject is traditional, too.


5. Notes on notation 


Formalization in chemistry is of little value. The rules of the chemical grammar are
easier to describe than to formalize, especially for a chemist.

I have a subtle grudge against mathematical formalism in its dominating form: it is based
on the axiom of closure, which, coming from the Aristotelian requirement of the
permanence of the subject, means that the set of terms during the discourse remains
unchanged. It efficiently eliminates any ability of mathematics to formalize the
phenomenon of novelty and evolutionary invention. As far as I know, only Bourbaki [16],
in the concept of the scale of sets, attempted to cover, albeit in a skeletal way, the unusual
subject of novelty and, therefore, evolution. Acquisition is a particular case of evolution.
How can you acquire something that is already in your bag? Until I am seriously rebuffed
by a mathematician, I swear never to miss a chance of drawing attention to it.

Although my aim here is to outline some ideas in an informal manner, a notation
can help clarify them and show how language is truly embodied in reality and what
chemists mean when they speak their lingo.

I use figure brackets and three other kinds of symbols: letters, simple lines, and
special arrows => or <=>. The letters can signify sound, word, phrase, sensation, image,
trace in memory, idea, etc. The lines are binary relations (if directed, the line turns into a
simple arrow), and the special arrows mean causation. The signs, including the brackets,
suggest, in sufficiently vague terms, that we deal with real world objects and processes
unfolding in real time and topological space. The underlying nature of these processes is
completely beyond the scope of the current discourse. The chemical parallels, however,
are clear: a letter is an atomic object, a line or arrow is a bond, and the special arrow is a
transformation. The symbols of chemical reaction, → and ⇄, are the verbs of the chemical
language, if you will, and => and <=>, which I use outside chemistry, are also verbs of a
kind. As I will try to show, the chemical formalism, while reflecting chemical ideas,
points to much more general concepts, raising some strange questions that never bother
either mathematicians or linguists or computer scientists.

{A,B} means not just the set with two elements, A and B. This is more like a 
topological neighborhood: the elements of the same set in some sense are close in space 
or time. More specifically, the two elements appear together in some situation. In case of 
language acquisition, {A,B} means that A and B are words, objects, or traces in 
memory perceived or recalled within a relatively short span of attention. They are, so to 
speak, pushed by chance or intent to face each other—a common thing in psychology 
and neurophysiology. In case of a chemical process, they are on collision or within a 
close spatial range or just are dashing around in the same flask, of course, not without the 
chemist’s hand in it. 





A—B means that elements A and B are bonded. AB means the same. 

A→B may denote a directed bond, which we find in configurations of thought and
many chemical bonds.


{A, B} ⊂ G, {A, B} => AB, AB => C, {A, B, C} ⊂ G

means a chain of transformations leading to the expansion—this is synonymous with 
acquisition—of the generator space G . First, A, B, elements of G , brought together, 
lock the bond AB. Configuration AB generates a new element C, which also enters 
G as a generator. C is the sign that denotes doublet AB and it can be used instead of 
it. This relation is reversible: AB <=> C. 

This is the tricky point where the strange questions arise. For support I can only 
turn to the authority of Bourbaki, who builds the scale of sets in this way. Nothing like 
that can really happen in nature, where matter has no double existence, but is natural in 
the mind. The mind is the second existence of the world and the language is the 
second existence of the mind. When we talk about really profound subjects, things are 
never completely clear and, as Niels Bohr once noted, opposite statements are both true. 
Only a chemist can confidently say that A and B reversibly combine into a very stable C, 
which does not exclude the existence of free A and B. For the chemist, AB and C are 
in equilibrium. For the rest, it is just a far-fetched metaphor. In cognition, however, we
can find a more sympathetic reception: C is a sign of the category to which AB 
belongs. When we think about cats and dogs, pets are kept in mind, and if we think 
about pets, both cats and dogs pop up. 

In PT we are like fish in water: C is the identifier of a composite generator. Two 
bonded generators (doublet AB) can be regarded as a new generator and assigned a 
separate symbol (C), which does not erase either A or B from generator space. The 
bonds of C are whatever free bonds are left after AB is bolted together. 
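The bookkeeping of a composite generator can be sketched in Python under a few assumptions: a generator is reduced to a name plus the set of its still-open bond coordinates, only linear L/R bonds are modeled, and the names `Generator`, `free_bonds`, and `lock` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    name: str
    free_bonds: frozenset  # open bond coordinates, e.g. {"L", "R"}


def lock(space, a, b, sign):
    """{a, b} => ab, ab => sign: bond a's right to b's left and name the doublet.

    Returns the expanded generator space and the new composite generator.
    Neither a nor b is erased from the space.
    """
    if "R" not in a.free_bonds or "L" not in b.free_bonds:
        raise ValueError("the two generators have no matching free bonds")
    # The bonds of the composite are whatever free bonds are left over.
    left_over = (a.free_bonds - {"R"}) | (b.free_bonds - {"L"})
    composite = Generator(sign, left_over)
    return space | {composite}, composite


A = Generator("A", frozenset({"L", "R"}))
B = Generator("B", frozenset({"L", "R"}))
G = frozenset({A, B})
G_expanded, C = lock(G, A, B, "C")
```

The original space G is left untouched and the expanded one holds A, B, and C side by side, which is the sense in which acquisition expands the generator space instead of rewriting it.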





We could spend a lot of time hairsplitting over the relation between set theory 
and Pattern Theory, mathematics and the world, theory of meaning, but I am least of all 
qualified to do it. What we are talking about is a very generic and universal thing: the 
hierarchy of building blocks, with which linguists, chemists, and engineers deal every 
day. The existence of dining tables is no threat to the existence of either the table boards 
or the legs in the inventories and storages. The existence of grammar does not jeopardize 
the prosperity of either syntax or morphology. 

Chemists practically always deal with multisets in the sense that atoms and molecules
are present in an enormous number of copies. I believe that the ability to build a hierarchy of
signs is as profound a property of mind as physical aggregation is a property of matter, but
symbols, signs, and shortcuts do not belong to matter. This is how I would formulate the
strange problem, which a mathematician could, probably, clarify: is the scale of sets or, to
take a simpler example, the set of all subsets (power set) a multiset? If yes, then there is a
non-trivial set which is also a non-trivial multiset. If no, then any set of sets is always
their union. All I could find was the obvious statement that the power set of multisets is a
multiset and that in computer science it is convenient to regard a power set as a multiset, but
the references were not reliable.

In connection with the strange problem, I would refer to memory—the crucial 
part of cognition and, more generally, life. Mathematics and the universal Turing machine assume 
an infinite memory. We recognize and/or memorize NEW and recognize or forget OLD. In
the human mind, if we do not use {A, B} anymore, it is forgotten, but C can remain, and
vice versa. 

Formally, AB => C is reminiscent of the composite arrow of category theory (CT) [10],
which exerts an influence on linguistics, unclear to me and some others, starting with
Chomsky. Generators, obviously, are different mathematical objects, but both objects are
associative.

I am determined to shun any discussion of what the terms close, appear, bond, 
stability, reasonable, result, etc., could mean: their meanings follow from their use, as
Wittgenstein believed, and this is why we, humans, sometimes cannot understand each
other. What is important, all such terms can have measures. If close, one can ask, then
how close? If something is stable, then is it more stable than something else? If it results,
transforms, or appears, then how fast? To ask such questions is a deeply ingrained habit 
of the chemist and the natural scientist in general. The remarkable aspect of PT is the 
ability to provide the framework for answering them regardless of the particular subject. 
Thus, approaching a speech generation problem from this typically chemical angle, we
might decide that not the most grammatically and semantically correct utterance, but the
one fastest to generate, will be produced and, probably, understood in context. Similarly, in
social and political matters, it is not the decisions most reasonable in the long run but those
easiest to implement that are most often taken, falling into the range from symbolic to violent
actions.

To conclude this session of a self-examination of the chemical mind, I would like 
to touch the evolutionary nerve of a chemist. 

While the general principles of evolution are a separate topic, far from consensus 
and not to be discussed here, the chemist’s view of evolution is more settled. It shapes the 
overall chemical attitude toward building any complex system. In a few words, in the 
style of Poor Richard’s Almanack , it is as follows. 

1. Easy does it. Complex systems are built from the simple ones in simple steps. 

This is the most axiomatic statement from which the other two are partly 
deducible. 

2. Rome wasn’t built in a day. Therefore (see 1), the building of a complex
system starts from a simple system.

3. Do not change horses in the middle of the stream. The steps are similar
throughout the evolution because when the step is simple (see 1), there is
not much margin for variation.





In Poor Richard’s words, "Haste makes waste," which is, actually, a nice definition of 
generalized temperature. 

From the PT standpoint, which is directly translatable into the chemical mindset, 
the acquisition of language consists of pattern analysis of speech, i.e., identification of 
generators, partition of generator space into classes, and selection of stable (regular) 
patterns that partition the configuration space. What is called Pattern Synthesis is the 
actual production of configurations, which is of no interest for us here, but should be for 
those working in speech generation. 

The steps will be described below. The reader should keep in mind that this is not 
a linguistic discourse but just a series of variations on the theme of “If words were
atoms.”


6. Bonding 


The concept of atomism is usually presented to schoolchildren as the granular structure of 
matter, but Lucretius, following Democritus, saw in bonding an intrinsic property of 
atoms: 


But now 

Because the fastenings of primordial parts 
Are put together diversely and stuff 
Is everlasting, things abide the same 
Unhurt and sure, until some power comes on 
Strong to destroy the warp and woof of each: 
Nothing returns to naught; but all return 
At their collapse to primal forms of stuff. 

(On the Nature of Things)





The chemical bond is well understood today. Formation of a bond, however, is a very
general property of the world, extending far beyond inanimate matter. Bonding, of
which both Pavlov’s dog, salivating at the sound of the bell, and the inseparable Romeo
and Juliet are quintessential examples, is neither a specifically human nor a specifically
linguistic phenomenon.

My central idea of language acquisition (most probably, not new) is: the new
word never introduces itself alone.

When a young monkey hears the warning cry of another monkey for the first time, 
the sign consists of a single “word,” but it associates with the subsequent sight of the 
predator and either a specific collective behavior or just a general commotion. If the 
predator never appeared after the cry, the effect would fade away. 

A word heard by a child for the first time comes either with another word or an 
object, gesture, action, sensation, appearance, etc. , i.e., as {A,B} , which leads to: 

{A,B} => AB, which implies AB => C:

{A,B} => AB => C; {A,B,C} ⊂ G

Two sensations that are close in time or space develop a bond between their 
representations. For example, the audible word “dog” and a visible particular dog can
form a link.



Figure 4. Linking of sound and sight and generation of the idea of dog 






A and B can also be two words or just any sounds and their combinations, or an 
idea and a sensation. 

In short, if two sensations belong to the same temporal or spatial set, the bond
between their representations “may follow,” which the symbol =>, similar to the
chemical →, signifies. “May follow” means that this is possible, but how probable it is
remains to be investigated. Moreover, the bond can fade away without repetitive
stimulus.

Chemistry is very unenthusiastic about the distinction between true and false, but
takes a great interest in questions like “how much? how soon?” even if the target of
the question seems false.

I am greatly tempted to extend the chemical analogy even further. All chemical 
reactions are by their very nature reversible. Only because of some special circumstances, 
like the irreversible escape of carbon dioxide, baking soda and vinegar cannot be 
reconstructed from the remaining products of their mixing. In a closed steel tube the 
equilibrium would be reached. 

The correct symbol for a chemical reaction is ⇌, which is what is usually meant by the
single arrow. If we modify our => into <=> (chemical ↔ means a very different thing
than ⇌), we will suddenly find ourselves facing the beautiful idea that if AB <=> C,
then whichever of A, B, or C somehow gets into the focus of attention, all three
generators, A, B, and C, will float there. It is only because of the limited capacity and
speed of our mind that a word does not bring into memory at least one tenth of the
entire Webster’s II New Riverside University Dictionary. If words were atoms, this is
what would happen in an infinite time.

Whether we should embrace the idea or not, I would save it for a separate
discussion elsewhere. It is related not to the subject of the poverty of stimulus but to the
more general subject of deep analogies between all natural discrete complex dynamic
systems and to the naive but deep questions like what is the difference between the dog
and the word “dog” and why there is no tangible thing called “animal” in the world.





If a word or sound is not tied to another word or sensation or something else, it is 
meaningless. All theories of meaning agree on it. Only some words are signs of external 
reality, but all of the atoms of language are meaningful, as a monolingual dictionary 
testifies. 


7. Acquisition of generators 


The visible delimiter, i.e., the space between words, is absent from Chinese, Thai, and 
ancient Greek texts, not to mention speech. Japanese gives some good cues. It is not seen and 
mostly not heard by children. Morphemes are embedded into words, too. If the child, 
ignorant of any theory, identifies verbal generators in the input and uses them, there must 
be a simple procedure to identify sound bites and their sequences as generators. 

The identification of generators is the purpose of pattern analysis. How do we 
know that a sign (word, morpheme, or phoneme) is a generator? The definition of an 
atom based on indivisibility is a negative one. There is no way to test the indivisibility a 
priori. Lo and behold, the atoms became indeed divisible. For a dog, goodboy is probably 
atomic. On the contrary, divisibility of speech can be easily established before the entire 
language has been acquired. 

Let us try the following simple rule based on divisibility: 

The word or morpheme is a generator if it enters at least two different 
configurations. This is how the rule looks in our quasi-chemical notation, defenseless 
against any mathematical criticism: 


{XB, XC} => X ∈ G

Or: if {XB, XC}, then, probably, X ∈ G.







In other words, if at least two generators from a generator space G can form
bonds with X, then X is a generator and it belongs to G. To put it even simpler, a
generator is what can bond with other generators, not what cannot be split into
generators.
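
The rule is simple enough to be tried out directly. The sketch below is my own illustration, not a tested model of acquisition: it assumes the input has already been cut into doublets, it counts partners on either side of an element, and the names `identify_generators` and `min_partners` are hypothetical.

```python
from collections import defaultdict


def identify_generators(doublets, min_partners=2):
    """Return the elements that bond with at least `min_partners` distinct partners."""
    partners = defaultdict(set)
    for left, right in doublets:
        partners[left].add(right)
        partners[right].add(left)
    return {x for x, seen in partners.items() if len(seen) >= min_partners}


observed = [("good", "boy"), ("good", "dog"), ("bad", "dog")]
generators = identify_generators(observed)
print(sorted(generators))
```

Here "good" and "dog" each enter two different configurations and qualify, while "boy" and "bad" have been met only once and remain undecided until more input arrives.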

By the same token, if a configuration can bond with other configurations or 
generators, it is also a generator. This is especially obvious for linear sequences. In 
writing, the hierarchy of generators is usually portrayed by using brackets of various 
types. Naturally, formal linguistics uses tree diagrams. 

The ultimate simplicity of local rules like the one just described, hypothetically,
requires simple innate physiological mechanisms common to all species with a nervous
system. It does not require learning.


8. Acquisition of bond space 


The sophisticated educated language is acquired by different means involving analysis, 
sometimes slow, of complex sentences and rhetorical devices as well as contact with 
complex subjects and situations. To learn chemistry, for example, means to learn what
you can say about it that will be grammatically, contextually, and factually acceptable, 
although not necessarily true. I believe that it is crucial for understanding language 
acquisition to remember that even a complex literary, philosophical, or scientific text 
could be told in a simplified childish syntax by cutting the sentence into simple segments. 

Here is a single sentence from The Portrait of a Lady by Henry James:

The large, low rooms, with brown ceilings and dusky corners, the deep embrasures and
curious casements, the quiet light on dark, polished panels, the deep greenness outside,
that seemed always peeping in, the sense of well-ordered privacy in the center of a
"property"—a place where sounds were felicitously accidental, where the tread was 
muffled by the earth itself and in the thick mild air all friction dropped out of contact and 
all shrillness out of talk—these things were much to the taste of our young lady, whose 
taste played a considerable part in her emotions. 

In the beginning, the conversion would go smoothly: 

There were rooms. There was a lady. The embrasures too. The light too. The 
lady was young. The rooms were large. The rooms were low. The ceilings were 
brown. The corners were dusky. ... The lady liked the rooms. ... 

But we would soon run into problems with these things were much to the taste of 
our young lady, which still can be dealt with, but sounds were felicitously accidental , 
and whose taste played a considerable part in her emotions are absolutely beyond a 
child’s experience. They are not observable by anybody but the author, in accordance 
with his esthetic position. To say and understand something like the above sentence 
requires not only a significant life experience but also an experience in reading literature. 
Moreover, it requires time to compose and optimize it. The average pre-school child 
naturally acquires language as a tool to study the higher floors of the edifice of literature 
at school where the artificial language of the civilization dominates. 

I believe that natural language acquisition ends with the ability to say who does 
what in connection to what or whom and in what fashion. The rest, starting with 
mastering compound sentences, is acquired by learning, analysis, synthesis, and 
conscious mimicking. It was my personal impression that Russians with up to seven
years of school, especially in the countryside, hardly ever used compound sentences with
more than one clause, which was not to the detriment of content.

Many technicalities of general PT become highly simplified for linear 
configurations, even more so if we speak only about acquiring a language sufficient for a 
child to maintain balance with the limited social environment. 





The generator of utterance has only left and right bonds. We can assume that they
have bond value spaces B_L and B_R on each side. All bond values β are various, possibly
nested, tags (modalities, following Ulf Grenander [13]) of the generator that signify its multiple
categorization: dog is noun, animal, direct or indirect object, etc. on the left, subject,
noun, animal on the right, etc. Its grammatical tags can be expressed as morphemes or
even by capitalization, as in German, but I am interested in how the bond space can be
acquired in childhood and not analytically.


[Diagram: generator A with bond values a, k, l, ..., p, q, ..., x, y]

Figure 5. Language generator A

In Figure 5, generator A has some bond values twice on both sides, which for 
example can happen with adjectives or listed nouns. The union of the left and right sets is 
a complete list of all bond values, which may be enough for languages with loose word 
order. 

For the left-to-right doublet AB, bond relation ρ = TRUE if
β_AR ∈ B_BL and β_BL ∈ B_AR,

i.e., if the two generators have the same bond values at the bonds in contact (Figure 6). 
Thus, “Dog bites man” and “Man bites dog” are equally grammatical. What contradicts 
the factual content of the sentence, however, decreases the stability of the wrong version. 
This is why some politicians love meaningless cliches. 
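
A minimal sketch of this check, under my own simplifying assumptions: each generator carries one set of bond values per side, and a doublet counts as regular when the two sides in contact share at least one value. The class `Generator`, the functions `is_regular` and `grammatical`, and the tags "subject" and "object" are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    name: str
    left: frozenset    # bond values exposed on the left side
    right: frozenset   # bond values exposed on the right side


def is_regular(a, b):
    """Simplified bond relation: the bonds in contact share at least one value."""
    return bool(a.right & b.left)


def grammatical(words):
    """A string is regular if every adjacent doublet is regular."""
    return all(is_regular(x, y) for x, y in zip(words, words[1:]))


noun_left, noun_right = frozenset({"object"}), frozenset({"subject"})
dog = Generator("dog", noun_left, noun_right)
man = Generator("man", noun_left, noun_right)
bites = Generator("bites", frozenset({"subject"}), frozenset({"object"}))

print(grammatical([dog, bites, man]), grammatical([man, bites, dog]))
print(grammatical([dog, man, bites]))
```

Both word orders with the verb in the middle pass, since nothing in the bond values distinguishes dog from man; the factual content would have to enter as a separate measure of stability.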


[Diagram: generator A, with left bond value x and right bond value a, joined to generator B, with left bond value a and right bond value y]

Figure 6. Regular doublet









It all looks much simpler to a chemist, who would just say that A and B form a
donor-acceptor couple. Even a linguist could not find anything new in the statement that
there are grammatical and ungrammatical adjacencies between morphemes, words, and 
phrases, which is a fair way to put it. I only paraphrase common linguistic knowledge, 
although I suspect that for any common knowledge in non-experimental linguistics there 
are two conflicting views. What I claim as new (what can be new after Lucretius, anyway?) is 
that the rules of grammar, whether universal or specific, extant or extinct, do not need to 
be stored anywhere in the mind as a book on a shelf. They could be contained—and this 
is a hypothesis in need of a test—in the properties of generators, similarly to the way 
molecules assemble not by principles and parameters but by the properties of the atoms. 

The knowledge of regular grammar, from the point of view of a chemist, is 
distributed among generators, up to a certain level of language evolution. Philosophers, 
scientists, and writers have inflated the language to such excess that poor children have to 
study their native tongue for many years at school, picking up irregularities from their
peers.


9. Acquisition of bond values 


The next important question on the agenda of language acquisition is generator 
classification. How are the bond values, which are signs for classes, acquired? Here is an 
extremely simple and local rule: 

{AB, AC} => {B, C} ⊂ G' ⊂ G

It means that if two generators B, C combine with the third, A, they have the
same bond value, acceptable by A. In other words, B and C belong to the same subset
G' of G. It means that the partition of generator space into classes is done by the
generators they can share. “Tell me who your friends are and I will tell you who you are” is
invertible: “Tell me who you are and I will tell you who they are.” The newly formed class
can expand in the same way or disappear if the juxtaposition of AB and AC was 
accidental. This is how language can be acquired in a quite mechanical way by children 
who would give very little thought to anything but fun and who will rise to higher levels 
of language only when they develop abstract thinking that takes time and is not automatic. 

Of course, this is only a hypothesis. Probably there are some supporting or 
contradicting works in linguistics literature. 

Remarkably, every act of juxtaposition of two doublets with a shared element 
works both ways: (1) identifying a new generator and (2) identifying a class: 

{A—B, A—C} => A ∈ G

{A—B, A—C} => {B, C} ⊂ G' ⊂ G

Next, {B, C} => D! The class acquires its sign.
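
The two readings of the same juxtaposition can be shown in one short sketch. It rests on my own assumptions: doublets are ordered pairs with the shared element on the left, and the names `read_both_ways`, `signs`, and the labels D1, D2, ... are hypothetical.

```python
def read_both_ways(doublets):
    """From doublets sharing a left element, return (generators, classes)."""
    partners = {}
    for left, right in doublets:
        partners.setdefault(left, set()).add(right)
    shared = {a: frozenset(p) for a, p in partners.items() if len(p) >= 2}
    return set(shared), set(shared.values())


generators, classes = read_both_ways([("A", "B"), ("A", "C")])

# {B, C} => D: every class found receives a sign of its own
signs = {cls: f"D{i}" for i, cls in enumerate(sorted(classes, key=sorted), 1)}
print(generators, signs)
```

A single pass over the input yields both results: A is recognized as a generator, and its partners B and C are collected into a class that then gets its own name.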

The effect depends on whether A or the bond with A is new. The notion of 
novelty, absent from mathematics and physics, but not from chemistry, where a 
molecular structure does not exist unless a posteriori and de facto, is of cardinal
importance for the formalization of evolution, which is exemplified by language 
acquisition. What the original UG concept seems to say is that there is nothing new in 
language acquisition but just selection from a timeless menu. 

A stone falls from the Tower of Pisa? No big deal, all possible trajectories and 
acceleration existed before the fall. The stone has chosen its trajectory, which is not so 
strange an idea for classical optics and quantum physics. 

Nevertheless, formal linguistics, driven by Chomsky himself, has been 
undergoing such an involution toward simplification (X-bar) and quantification 
(optimality), that sooner rather than later it is going to fuse even with the principles of PT,
like Monsieur Jourdain, without realizing it. I draw attention to PT not because I
believe that linguistics cannot find its own way to consensus, but because it illuminates
the place of linguistics among other natural sciences, where my native chemistry dwells 
nearby. 

I try to show in Figure 7 how the X-bar concept can be converted into its PT
form by excluding imaginary XP and X'. The resulting generator has arity 3 (number of
bonds) and is not good for building strings. This is why formal linguistics in all its forms
has to bend over backwards to find imaginary movements and linearize the typically
non-linear trees.


Figure 7. From X-bar (left) to generator (right). A: Adjunct, etc.


If the notation {A,B}, as I said, means that A and B are not just elements of a set 
but are close to each other in some realistic sense, the transformation sign => in 
{AB, AC} => {B,C} needs clarification. If the elements are on the left of => , they 
are close in perceived reality. We say that they are in the same topological neighborhood 
in time and/or space and form a distinct cluster. There must be some physical or
physiological reason why we assemble them in the brackets. Coming back to the analogy
with the singles bar (a refreshing step away from X-bar), the brackets on the left of =>
represent the singles bar, say, at 8 PM and on the right we see the same bar at 10 PM. 






The closeness does not mean the Euclidean distance. A good example is a circle 
of friends who may be separated by the entire continent or just a street but call each other 
over the telephone with the same minimal effort. A better term is the channel of 
communication, for which the maximum distance between humans we know is from the 
Earth to the Moon. At the other end of the human scale are communicating neurons. The 
physical counterpart of communication is interaction and the chemical one is collision. 

Human intellectual functions shuttle between the mind and the real world as— 
you’ve guessed it right— the bees shuttle between the flowers and the beehive. 

Expressions like {A,B} and AB => C suggest another obvious idea: the concept 
of a strong bond is an expansion of the concept of the set. In the real world and in the real
mind, elements are placed into the same set for a reason. They might form a bond if they
stayed in the set long enough, and fuse if they stayed even longer. We can say that
{A,B} and C are at the two ends of a continuous scale with AB somewhere in the
middle. 


10. Locality 

Locality in the current context means that in the process of acquisition and identification we
look only within a 1-neighborhood (immediate topological neighborhood, see Figure 8)
of the generators and not any farther than that. We do not need to consult either a
grammar or an algorithm. No long-term memory is needed to keep intermediate data
because they are all on hand.



Figure 8. The substrate of local operations






The target for both identification and classification is the same: two generators in 
the neighborhood of the third one. The identification of the generator (if generator 
combines with two others... etc.) is simply seen from the other side: if two generators 
form bonds with the same third one, they belong to a subset (class) of G. Therefore, the 
class is defined by the affinity (or aversion) of all its elements to a common generator. 

As I believe, what the child gradually acquires is not any grammar as the list of 
rules, like the basic word order SVO, but the hierarchical partition of sounds into 
morphemes, words, word groups, phrases, and stylistic devices that constitute the 
generator space in which generators have specific bond structures, so that SVO order in 
English comes out automatically as soon as the abstract generators S, V, and O are 
formed. Of course, the pre-school child has no idea about syntactic categories. When
speech is generated, the content and form are reconciled in the process of linearization
toward the minimization of stress [3]. 


11. Some examples 

This is an imaginary way in which high-level generators and patterns can be
acquired by a child-robot:

{eat—apple, eat—carrot} => {apple, carrot} ⊂ G1 ⊂ G
{eat—apple, take—apple} => {eat, take} ⊂ G2 ⊂ G
Pattern: G2—G1

{Mary—eat, Mary—take} => Mary—G2
Pattern: Mary—G2—G1

{Mommy—eat, Mommy—take} => Mommy—G2

{Mary—G2, Mommy—G2} => {Mary, Mommy} ⊂ G3 ⊂ G
Pattern: G3—G2—G1





Much later the child learns at school that G3, G2, and G1 are terms of syntax
and discovers that he or she had been guided by the invisible hand of the Grammar.
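
The child-robot example can be replayed mechanically. The sketch below is my own illustration under stated assumptions: the input is a list of left-to-right doublets that the child has heard, a class is any group of two or more elements sharing a neighbor on the same side, and the function name `classes_by_shared_neighbor` is hypothetical.

```python
from collections import defaultdict


def classes_by_shared_neighbor(doublets):
    """Group elements that share a left neighbor or a right neighbor."""
    followers = defaultdict(set)   # left element -> elements seen to its right
    leaders = defaultdict(set)     # right element -> elements seen to its left
    for left, right in doublets:
        followers[left].add(right)
        leaders[right].add(left)
    groups = list(followers.values()) + list(leaders.values())
    return {frozenset(g) for g in groups if len(g) >= 2}


heard = [
    ("eat", "apple"), ("eat", "carrot"), ("take", "apple"),
    ("Mary", "eat"), ("Mary", "take"),
    ("Mommy", "eat"), ("Mommy", "take"),
]
found = classes_by_shared_neighbor(heard)
for cls in sorted(found, key=sorted):
    print(sorted(cls))
```

On these seven doublets the procedure returns exactly three classes, which correspond to the things eaten, the actions, and the actors of the example, without any reference to nouns or verbs.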

Here is another imaginary example, inspired by observations on the acquisition of
German noun gender by native children [10A]: acquisition of the German
case-gender-number nominal marking system.

Vocabulary: Hund: dog; Hündchen: puppy dog; Mäd: the word does not exist;
Mädchen: girl; -chen: suffix of diminutive form, marker of neuter gender.

{Hund, Hündchen} => Hund- ∈ G
{Das Hündchen, Das Mädchen} => -chen ∈ G
{Ein Hünd-chen, Das Hünd-chen} => -Hünd- ∈ G; similarly,
{Hündchen, Mädchen} => (-Hünd-, -Mäd-) ⊂ G_chen
{Das A-chen, Das B-chen} => Das—G_chen; (A, B) ⊂ G

Next, consider the basic word order of subject, direct object, indirect object, and 
verb. For English Imperative, the triplet in Figure 9 serves as a template. 



Figure 9. Pattern template 

Here K and M are in the immediate neighborhood of L. K, M, and L represent 
whole classes of generators. 

The triplets constitute already a transition phase from Nean, the language of 
doublets, to the advanced grammaticalized language, in which the alternatives of word 
order start branching, leading to the abundant, but not unlimited, variety of languages. 

The probabilistic approach to syntax in [18] comes very close to the idea of Nean as the
language with doublet or triplet syntax resulting from haplology [3]. Moreover, this work
suggests another idea: the sentence is generated from overlapping short fragments. I
would call it tiling.
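
What I mean by tiling can be shown with a toy sketch. It is my own illustration, not the method of [18]: fragments are tuples of words, each fragment must begin with the word that ends the part already assembled, and the function name `tile` is hypothetical.

```python
def tile(fragments):
    """Lay overlapping fragments end to end, merging the shared word."""
    sentence = list(fragments[0])
    for fragment in fragments[1:]:
        if fragment[0] != sentence[-1]:
            raise ValueError(f"{fragment} does not overlap {sentence[-1]!r}")
        sentence.extend(fragment[1:])
    return " ".join(sentence)


utterance = tile([("Mary", "eats"), ("eats", "the", "apple")])
print(utterance)
```

Each step looks only at the last word already placed, so the assembly stays local in the same sense as the acquisition rules.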

The advanced natural evolution of syntax when there was still no Henry James 
and William Faulkner in sight is a separate problem. As a mental experiment, we can try 
to guess why, under the pressure of linearization, in polysynthetic languages like 
Mohawk the L—M doublet develops into a verb prefix, in highly inflected Slavic 
languages, like Russian, the same doublet takes the form of case endings, and in English it
mostly evaporates, leaving only the ephemeral word order: 

Jack he-her-like Mary (Mohawk) 

Jack like-he Mary-her (Russian) 

Jack like-he Mary (English) 

All three syntactic constructs contain all necessary information to avoid
ambiguity. 

With an indirect object, “Give Alex the toy” would look like “You-him-it-give 
toy Alex” in a strongly polysynthetic language and “Give-you toy-it Alex-to” in a highly 
inflected language. Nevertheless, I could not find in my limited corpus of Mohawk, 
collected from the Web, anything like “you-him-it,” which would generate up to a
thousand different verb prefixes.

Sak wa'-ku-hsvn-u-'. 

Sak fact-1sS/2sO-name-give-punc (linguistic glosses)

Sak indeed-I-you-name-give-this moment 
‘I (hereby) give you the name Sak.’ 

Here (the example is from [19]) punc (', which is a sound) means that the action
is one-time and fact (wa') means the factual mood. Morpheme ku means “I (subject) [do
it to] you (object)” and “Sak” is not marked by any morpheme. The distinction between
direct and indirect object, sharpened by word order or preposition in English, is blurred
here, but the content is absolutely clear. This confirms to me that the distinction between
syntactic categories like direct and indirect object is rather artificial. In the configurations
of thought [3] only bond couples are real. Moreover, I think, together with some linguists, 
that the grammatical categories like verb and noun are not without borderline fuzziness, 
especially in the non-Indo-European languages, but the Indo-European origin of grammar 
strongly influences our thinking. 

In the above Mohawk example [19], “name” (hsvn; v is a nasal vowel)
immediately precedes “give” (u , another nasal vowel) creating the compound verb 
“namegive” (hsvnu), in a manner common also to German. But the Russian language 
does not mind scattering the three endings all over the phrase. The diluted medicine is less 
bitter, but the body of the Russian sentence can grow bloated. 

We can speculate about why the extraordinary redundancy of Russian and Polish (shared, 
it seems, by Swahili) has survived a thousand years, satellites, and perestroika and why the
inflective redundancy of Old English was so fragile. The tribal languages, it seems, aimed 
at rendering the nonlinear thought in the most straightforward way, making clear who did 
what to whom, without requiring any guessing. This holistic property of tribal 
languages—and they can have an astonishing complexity—was, probably, perpetuated by 
a limited number of situations meaningful for the tribal society. We can find the most 
generic ones in folk tales. This became an atavism after entering the modern era with
literacy as the most powerful stabilizing factor. 

Unlike the English nation, the Russians did not know intense ethnic mixing. The Mongol 
invasion and 240 years of their domination added some words to the vocabulary but the
two peoples did not actually mix. The Mongols either lived a separate life or accepted the
Russian one. 

With the PT approach we might find that the excruciatingly thorny problem of the
evolution of language is not hopeless. As in chemistry, we may find out what is “more
true” and more expedient at given conditions, which are accessible to archeology. I can
imagine research (no doubt, very difficult) into the relation between the way of life and the
structure and vocabulary of tribal languages, but I am not aware of any. To separate
language from conditions of life and culture would be equivalent to a chemist ignoring the
conditions of a chemical reaction, i.e., temperature, pressure, agitation, catalysts, acidity,
irradiation, etc. Whether language defines culture, I have no opinion. But if culture
defines language the observable result will be the same. This is yet another case of 
circular logic, which can be resolved only in one way: the less stressed complex of 
culture and language survives. 

Linguists cannot forget the debates over Benjamin Whorf’s denial of the category of time in
Hopi. It turned out that he was partly right. Some strange things with tenses happen also
in Mohawk [19]. There is a sophisticated system of Future Tenses in Maya. In my view it
is only natural that the perception of time in a tribal pre-industrial society could be very
different from ours, shaped by timetable, heat engine, clock, and fertility of imagination. I
feel comfortable with the idea of the past as what will always be in the future, like the
death of a relative, and the idea of the future that is a pure possibility, intent (I will eat),
or just present—all without any guarantee of realization. The present can be expressed 
not only by I’m going to eat, je vais manger , and eszni fogok (Hungarian), but also by 
simple Present used in Russian and Hungarian as Future with an adverbial modifier of 
time. The Bulgarian Future Perfect, with two auxiliary verbs and an unusual pattern of 
change, is truly remarkable from this viewpoint, as the entire florid Bulgarian verb 
system is. On the other hand, the ghostly Future in Japanese is another convincing 
illustration of the idea that, come to think about it, there is only Past and non-Past in the 
naive physics of the world. We can talk about future, but it certainly does not exist. Every 
philosopher starts with inventing his (this is a truly manly occupation) own language. 


12. Language and homeostasis 


From the PT point of view, the language generation amounts to the problem of pattern 
synthesis, which is the production of regular configurations of the same pattern. 

Our next and last step is pattern synthesis during language acquisition. How to
ensure that the acquired language takes a grammaticalized shape and not just any shape,
but that of the surrounding language? How are patterns selected? In general, the generator
space does not guarantee a unique way of self-assembly of generators into configurations,
either in chemistry, as isomers demonstrate, or in linguistics, as the languages of loose
word order and even the two forms of indirect object illustrate.

Unfortunately, as I believe, chemistry can tell us at this point very little, if anything at all.
This pessimism is not shared by the whole school of evolutionary linguistics that simulates
language evolution within the framework of competition and selection, based on the
groundbreaking models of Manfred Eigen and his group. This direction is represented in
linguistics by Martin Nowak, whose background is in mathematics, biology, and
evolutionary dynamics [20]. Instead of criticizing these highly valuable and insightful
models, under the spell of which I have been for over 20 years, I will make a constructive
(I am sure, not new) suggestion of an ultimate simplicity.

The language acquired by a child differs very little from the ambient one because
of social homeostasis. Speaking a non-standard and peer-challenging language and suffering
mutual misunderstanding as a consequence would create stress, which most humans and
animals, except some born leaders and troublemakers, would avoid by any available
means. In psychology it is known as the theory of balance.

The patterns of speech based on the acquired generator space will be selected not 
so much by individual selective advantages as by the stability of the whole. Here I have 
little to say but to refer to [3] and [4]. 

I cannot resist a temptation, however, to generalize this principle over biological 
evolution, the area which, in spite of Darwin and molecular biology, is as far from 
consensus as linguistics. Evolution of species is not (I say it arrogantly, without any “not 
only”) the survival of the fittest, because the fittest is always the one who survives, but 
the homeostasis of the biosphere subject to external (for example, climatic), internal (for 
example, cyclic or catastrophic non-linear fluctuations), or just random perturbations. 

A non-equilibrium dissipative system, to which all life (biosphere) and its
manifestations (noosphere) belong, searches and finds a way to end the stress of the
perturbation. I believe this idea follows from the ideas of Ilya Prigogine [21] and
William Ross Ashby [22]. The concept of punctuated equilibrium [23] is the closest to
it. One of the cardinal insights of this entire approach is that the disturbed complex
dissipative system returns not to the previous state, which is hard to find among the
enormous number of possible states and pathways, but to a new and more stable one.

This is what makes dissipative systems so different from common chemical systems 
which automatically find the point of equilibrium. 

Homeostasis is a highly natural way of thinking for a chemist even though very 
few chemists deal with dissipative systems. 

What do you think happens when you disturb the universe by mixing baking soda 
and vinegar? It dissipates carbon dioxide and comes not to the previous state but to a new 
one, from which there is no way back to the previous one. Moreover, nothing else can 
happen there on its own. 

You can heat up and cool down a flask with chemicals millions of times with the 
same result, but the Sun warmed up the Earth and left it to cool down millions of times 
until life stepped out because the Earth was an open system. 

The live dissipative system can change many times without the interference of the 
chemist or, for that matter, anybody else, while the supply of solar energy lasts. 

To illustrate this idea, the discovery of the mineral fuel and heat engine was a great
disturbance of the previous civilization. It first occurred locally, between Manchester and
Birmingham. Today it throws global civilization into turmoil. To recuperate, we are burning
the mineral fuel in increasing amounts until the homeostasis is, hopefully or
woefully, restored in a new civilization, which may not welcome humans as we know
them at all.





Conclusion 


Why in the world are we speaking about thermodynamics? Isn’t chemistry far enough
from linguistics? And isn’t it obvious that I cannot prove a word? No, I can’t. I do not
consider myself an amateur linguist. I am only a chemist. But I would like to plant a seed 
of something other than just a doubt in Plato’s idea. 

Why the verb in Mohawk, Inuit, and other polysynthetic languages is so loaded 
with short morphemes indicating moods and aspects, not to mention major syntactic 
functions, why the verb in Japanese is practically naked and so is the attributive adjective 
in the noun- and verb-overdressed Hungarian, why French has a fair number of tenses but 
the otherwise much sparser English is not too far behind, why the Titianesque Russian 
has no article and the noun-frugal Bulgarian slaps the article onto the noun from behind, 
why English lacks diminutive, derogative, and affectionate suffixes, present in Italian, 
essential in Russian, and some showing up even in the stern German, and why Yiddish
has no simple Past Tense—such questions could be answered if a function similar to 
energy of a molecule or some other measure of stability could be found for any segment 
of speech—which is just a thought, crudely squashed (but don’t mince your words!) and 
drawn through a narrow hole regardless of the word segmentation. 

Of course, we speak as our forefathers did, but there was some reason why they 
had departed from their non-speaking forefathers. If there are some laws of nature, they 
apply to both our forefathers and little children. 






Figure 10. Bees carry pollen, words carry grammar 

I believe linguistics could move closer to the status of natural and consensus-based
science if one fine morning it discovered in the alien chemistry its own reflection
in a gritty, wavy, cracked, but still a mirror. I hope chemists, on their part, could 
someday realize that they can give something else to the world except pollution, side 
effects, and genetic danger: the flowers of new universal ideas. 

As for the language acquisition, see Figure 10. 


NOTE (February, 2007) . In my view, nothing supports the “bee” mechanism of 
language acquisition as much as the data about bilingual children. They do not mix 
languages, although nothing seems to prevent them from mixing up at least the nouns. In 
the “bee” ideogrammatic language, the two bee species feed on two different species of 
flowers. 







REFERENCES 

See also: http://spirospero.net/complexity.htm . 

1. Mason, Timothy. Could Chomsky be Wrong?
   http://perso.club-internet.fr/tmason/WebPages/LangTeach/CounterChomsky.htm

2. Baker, Mark C. The Atoms of Language. New York: Basic Books, 2001.
   Mark Baker’s publications: http://ling.rutgers.edu/people/faculty/baker.html

3. Tarnopolsky, Yuri. Tikki Tikki Tembo: The Chemistry of Protolanguage, 2004.
   http://spirospero.net/Nean.pdf

4. -. Molecules and Thoughts: Pattern Complexity and Evolution in Chemical
   Systems and the Mind, 2003.
   www.dam.brown.edu/ptg/REPORTS/MINDSCALE.pdf
   Or: http://spirospero.net/mindscale.pdf

5. -. Transition States in Patterns of History, 2003.
   http://spirospero.net/HistMath1.pdf

6. Lakoff, George and Johnson, Mark. Metaphors We Live By. Chicago: University
   of Chicago Press, 1980.

7. Lakoff, George. Women, Fire, and Dangerous Things: What Categories Reveal about
   the Mind. Chicago: University of Chicago Press, 1987.

8. Grenander, Ulf. Elements of Pattern Theory. Baltimore: Johns Hopkins University
   Press, 1995.
   Advanced works:
   -. Pattern Synthesis. Lectures in Pattern Theory, Vol. I. New York:
      Springer-Verlag, 1976.
   -. Pattern Analysis. Lectures in Pattern Theory, Vol. II. New York: Springer, 1978.
   -. Regular Structures. Lectures in Pattern Theory, Vol. III. New York: Springer, 1981.
   -. General Pattern Theory: A Mathematical Study of Regular Structures. Oxford,
      New York: Oxford University Press, 1993.

9. Language Evolution. Edited by Morten H. Christiansen and Simon Kirby. Oxford:
   Oxford University Press, 2003.

10. Matthiessen, Christian and Halliday, M. A. K. Systemic Functional Grammar: A
    First Step into the Theory.
    http://minerva.ling.mq.edu.au/resource/VirtuallLibrary/Publications/sfg firststep/SFG%20intro%20New.html

11. Brian MacWhinney’s home page with a library of papers.
    http://psyling.psy.cmu.edu/brian/

12. Pullum, Geoffrey K. and Kornai, András. Mathematical Linguistics.
    http://www.kornai.com/MatLing/matling3.pdf

13. Grenander, Ulf. Patterns of Thought.
    www.dam.brown.edu/ptg/REPORTS/mind.pdf

14. Rosch, E. Human Categorization. In N. Warren (ed.), Studies in Cross-cultural
    Psychology. London: Academic Press, 1977, vol. 1, pp. 1-49.
    -. Principles of categorization. In E. Rosch and B. B. Lloyd (eds.), Cognition and
    categorization. Hillsdale, NJ: Erlbaum, 1978, pp. 27-48.

15. Megyesi, Beata. The Hungarian Language: A Short Descriptive Grammar.
    http://www.speech.kth.se/~bea/hungarian.pdf

16. Bourbaki, Nicolas. Elements of Mathematics: Theory of Sets. Boston:
    Addison-Wesley, originally published by Hermann (Paris), 1968, pp. 259-382.

17. MacWhinney, B. J., Leinbach, J., Taraban, R., & McDonald, J. L. Language
    learning: Cues or rules? Journal of Memory and Language, 28, 255-277 (1989).
    http://psyling.psy.cmu.edu/papers/cues.pdf

18. Lafferty, John, Sleator, Daniel, and Temperley, Davy. Grammatical Trigrams: A
    Probabilistic Model of Link Grammar.
    http://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/gram3gram.ps

19. Baker, Mark and Travis, Lisa. Mood as Verbal Definiteness in a “Tenseless”
    Language. Natural Language Semantics, 5: 213-269 (1997).
    http://ling.rutgers.edu/people/faculty/baker/mohawk-mood-prt.pdf

20. Nowak, Martin. From Quasispecies to Universal Grammar. Z. Phys. Chem. 216
    (2002) 5-20.
    http://www.ped.fas.harvard.edu/pdf fdes old/ZPhysChem02.pdf













    Martin Nowak’s publications:
    http://www.ped.fas.harvard.edu/publications.html

21. Prigogine, Ilya and Stengers, Isabelle. Order out of Chaos. New York: Bantam, 1984.
    Also: Nicolis, G. and Prigogine, I. Exploring Complexity. New York: W. H. Freeman,
    1989. Stengers, I. and Prigogine, I. The End of Certainty: Time, Chaos, and the
    New Laws of Nature. New York: Free Press, 1997.

22. Ashby, W. Ross. Design for a Brain: The Origin of Adaptive Behavior, 2nd ed.
    New York: Wiley, 1960. Originally published in 1952.
    -. An Introduction to Cybernetics. London: Chapman & Hall, 1964. Originally
    published in 1956.

23. Eldredge, N. and Gould, S. J. Punctuated equilibria: an alternative to phyletic
    gradualism. In Models in Paleobiology, T. J. M. Schopf (ed.). San Francisco:
    Freeman, Cooper, 1972, pp. 82-115.






APPENDIX 

THE CHEMISTRY OF THE THREE LITTLE PIGS 


As a preliminary illustration of some ideas expressed in this paper, I will explore a 
fragment of a text as a substrate for elementary local operations of generator and bond 
space acquisition. It is by no means a description of the mechanism itself because the real 
perception and processing of the text is diachronic, while my examination is going to be 
synchronic. The text here comes into view not bit by bit, as it should, but as a chunk. 

A simulation of language acquisition, as I believe, cannot be done with any single 
compact segment of perceived speech as input. A minimal requirement of a realistic 
simulation is a long series of language stimuli, coming in packages, like the Internet 
content, over an extended time, and against a background of realistic interactive content. 
This is a difficult task, remotely comparable with designing a game like the Sims, 
http://thesims.ea.com/us/ . 

There are scores of works on child language acquisition, and corpuses are 
available, but direct observations of children are intrusive and difficult to realize on a 
large and comprehensive scale, as even the works on chimp language testify. 

We have numerous theories of the origin of the universe, life, and language because we
cannot observe the origins. Nevertheless, most scientific theories work near perfectly
every day. The bulk of linguistic theories can be tested by building a talking, writing,
and translating machine that develops its abilities in a human environment, from scratch,
and without any algorithm. This is still easier than to create the universe.

The target text is a compact modified fragment from the tale of The Three Little
Pigs.

Source: Jacobs, Joseph. "The Story of the Three Little Pigs." English Fairy Tales. London:
David Nutt, 1890. http://www.surlalunefairytales.com/index.html


The target text P is a character array of 130 words, given here in the form of 
MATLAB input: 


% One word per row of a padded character array; '_' is the space entry between sentences.
P = char('there', 'was', 'an', 'old', 'sow', 'with', 'three', 'little', 'pigs', 'and', ...
    'as', 'she', 'had', 'not', 'enough', 'to', 'keep', 'them', 'she', 'sent', 'them', 'out', 'to', ...
    'seek', 'their', 'fortune', '_', 'the', 'first', 'that', 'went', 'off', 'met', 'a', 'man', 'with', 'a', ...
    'bundle', 'of', 'straw', 'and', 'said', 'to', 'him', '_', 'please', 'man', 'give', 'me', 'that', ...
    'straw', 'to', 'build', 'a', 'house', '_', 'which', 'the', 'man', 'did', 'and', 'the', 'little', ...
    'pig', 'built', 'a', 'house', '_', 'presently', 'came', 'along', 'a', 'wolf', 'and', 'knocked', ...
    'at', 'the', 'door', 'and', 'said', '_', 'little', 'pig', 'let', 'me', 'come', 'in', '_', 'the', 'pig', ...
    'answered', '_', 'no', '_', 'the', 'wolf', 'then', 'answered', 'to', 'that', '_', 'then', 'I', 'll', ...
    'puff', 'and', 'I', 'll', 'blow', 'your', 'house', 'in', '_', 'so', 'he', 'puffed', 'and', 'he', ...
    'blew', 'his', 'house', 'in', 'and', 'ate', 'up', 'the', 'little', 'pig');


We apply to the target the following transformations written as “chemical” 
reactions, in which X is a variable. By “chemical” I mean, actually, “pattern-theoretical,” 
but I cannot use the latter term because the ideas are not explicitly formulated in PT. I 
infer them perhaps incorrectly. Equilibrium is a “chemical” counterpart of equivalence 
and association in cognitive sciences. It means, half-seriously, that if one thinks about 
three little pigs (A), the wolf (B) promptly comes to mind because the entire story (C) is 
remembered. The story is in equilibrium with all its components, which is pretty close to 
the chemical idea of equilibrium. I cannot invade the heartland of cognitive sciences, but 
from a distance I would repeat again the parallel between the chemical flask and the mind. 


    {A,B} <=> {AB}        bonding equilibrium (1)
    {AX,BX} => X ∈ G      generator identification (2)
    {A,B}                 generator categorization (3)
    {A,B} <=> C           representation equilibrium (4)
    {AX,BX} <=> CX        bonding categorization and
                          its representation equilibrium (5)





Using a simple program, a vocabulary of 71 words, including space (_ ), was 
extracted from P and the words were analyzed for their left and right neighbors in P. The 
results are in the Table: 


Table: Vocabulary and neighborhoods of The Three Little Pigs

Left neighbor | No. | Word | Right neighbor

fortune him house house said in answered no that in | 1 | _ | the please which presently little the no the then so
 | 2 | there | was
there | 3 | was | an
was | 4 | an | old
an | 5 | old | sow
old | 6 | sow | with
sow man | 7 | with | three a
with | 8 | three | little
three the _ the | 9 | little | pigs pig pig pig
little | 10 | pigs | and
pigs straw did wolf door puff puffed in | 11 | and | as said the knocked said I he ate
and | 12 | as | she
as them | 13 | she | had sent
she | 14 | had | not
had | 15 | not | enough
not | 16 | enough | to
enough out said straw answered | 17 | to | keep seek him build that
to | 18 | keep | them
keep sent | 19 | them | she out
she | 20 | sent | them
them | 21 | out | to
to | 22 | seek | their
seek | 23 | their | fortune
their | 24 | fortune | _
_ which and at _ _ up | 25 | the | first man little door pig wolf little
the | 26 | first | that
first me to | 27 | that | went straw _
that | 28 | went | off
went | 29 | off | met
off | 30 | met | a
met with build built along | 31 | a | man bundle house house wolf
a please the | 32 | man | with give did
a | 33 | bundle | of
bundle | 34 | of | straw
of that | 35 | straw | and to
and and | 36 | said | to _
to | 37 | him | _
_ | 38 | please | man
man | 39 | give | me
give let | 40 | me | that come
to | 41 | build | a
a a your his | 42 | house | _ _ in in
_ | 43 | which | the
man | 44 | did | and
little little the little | 45 | pig | built let answered
pig | 46 | built | a
_ | 47 | presently | came
presently | 48 | came | along
came | 49 | along | a
a the | 50 | wolf | and then
and | 51 | knocked | at
knocked | 52 | at | the
the | 53 | door | and
pig | 54 | let | me
me | 55 | come | in
come house house | 56 | in | _ _ and
pig then | 57 | answered | _ to
_ | 58 | no | _
wolf _ | 59 | then | answered I
then and | 60 | I | ll ll
I I | 61 | ll | puff blow
ll | 62 | puff | and
ll | 63 | blow | your
blow | 64 | your | house
_ | 65 | so | he
so and | 66 | he | puffed blew
he | 67 | puffed | and
he | 68 | blew | his
blew | 69 | his | house
and | 70 | ate | up
ate | 71 | up | the
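The "simple program" itself is not given in the text. A minimal Python sketch of what such a program might look like follows; it is an illustration and not the original code. The function name `neighborhoods` is hypothetical, and the positions of the "_" separators are an assumption inferred from the neighbor lists in the table.

```python
from collections import defaultdict

# Token stream for P; "_" marks a break between sentences. The positions of
# "_" are an assumption inferred from the neighbor table.
TEXT = (
    "there was an old sow with three little pigs and as she had not enough "
    "to keep them she sent them out to seek their fortune _ the first that "
    "went off met a man with a bundle of straw and said to him _ please man "
    "give me that straw to build a house _ which the man did and the little "
    "pig built a house _ presently came along a wolf and knocked at the door "
    "and said _ little pig let me come in _ the pig answered _ no _ the wolf "
    "then answered to that _ then I ll puff and I ll blow your house in _ so "
    "he puffed and he blew his house in and ate up the little pig"
)


def neighborhoods(tokens):
    """Return the vocabulary and each word's left and right neighbors."""
    vocab, seen = [], set()
    left, right = defaultdict(list), defaultdict(list)
    for i, word in enumerate(tokens):
        if word not in seen:  # first appearance: new vocabulary entry
            seen.add(word)
            vocab.append(word)
        if i > 0:
            left[word].append(tokens[i - 1])
        if i + 1 < len(tokens):
            right[word].append(tokens[i + 1])
    return vocab, left, right


tokens = TEXT.split()
vocab, left, right = neighborhoods(tokens)
print(len(vocab), "vocabulary entries")
print("the ->", right["the"])
```

Neighbors are kept in order of occurrence and with repetitions, as in the table, so that frequency is not lost at this stage.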






The following is a kind of chemical analysis of the table. 

We encounter some doublets and triplets of high occurrence in everyday speech, 
for example: 



In chemical language, if used frequently, the doublets and triplets can crystallize
and form composite generators, provided the abstract temperature, which is the level of
chaos, is low enough. In P, however, the statistics are meaningless because of the small
size.
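As a rough illustration of "crystallization," the doublets of P can simply be counted and those above a cutoff kept. The Python sketch below makes two assumptions that are not in the text: a plain frequency threshold stands in for the abstract temperature, and doublets that straddle a sentence break ("_") are ignored. The function names are hypothetical.

```python
from collections import Counter

# Token stream for P; the "_" positions are inferred from the neighbor table.
TEXT = (
    "there was an old sow with three little pigs and as she had not enough "
    "to keep them she sent them out to seek their fortune _ the first that "
    "went off met a man with a bundle of straw and said to him _ please man "
    "give me that straw to build a house _ which the man did and the little "
    "pig built a house _ presently came along a wolf and knocked at the door "
    "and said _ little pig let me come in _ the pig answered _ no _ the wolf "
    "then answered to that _ then I ll puff and I ll blow your house in _ so "
    "he puffed and he blew his house in and ate up the little pig"
)


def count_doublets(tokens, separator="_"):
    """Count adjacent word pairs, skipping pairs that touch a sentence break."""
    pairs = zip(tokens, tokens[1:])
    return Counter(pair for pair in pairs if separator not in pair)


def crystallized(counts, threshold):
    """Doublets frequent enough to be treated as composite generators."""
    return {pair for pair, n in counts.items() if n >= threshold}


tokens = TEXT.split()
doublets = count_doublets(tokens)
stable = crystallized(doublets, threshold=3)
print(stable)
```

With so short a text only one doublet passes a cutoff of three, which agrees with the remark that the statistics of P are too small to mean much.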

The following is a series of examples of what can “chemically” happen with P 
as a substrate. 


1.

    25 | the | first man little door pig wolf little

THE creates the tentative class of all words right of THE. The classification
may diachronically survive or fall apart. We need a name for the class, and THE-X is a
natural one.

Class THE-X: X= {first, man, little, door, pig, wolf} 

We know that THE-X includes both nouns and adjectives, but the child-robot 
does not know grammar. 
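The formation of such a tentative class can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions: the helper `right_class` is a hypothetical name, and the "_" separators in the token stream are inferred from the neighbor table.

```python
# Token stream for P; the "_" positions are inferred from the neighbor table.
TEXT = (
    "there was an old sow with three little pigs and as she had not enough "
    "to keep them she sent them out to seek their fortune _ the first that "
    "went off met a man with a bundle of straw and said to him _ please man "
    "give me that straw to build a house _ which the man did and the little "
    "pig built a house _ presently came along a wolf and knocked at the door "
    "and said _ little pig let me come in _ the pig answered _ no _ the wolf "
    "then answered to that _ then I ll puff and I ll blow your house in _ so "
    "he puffed and he blew his house in and ate up the little pig"
)


def right_class(tokens, pivot, separator="_"):
    """Tentative class pivot-X: distinct words seen immediately right of pivot."""
    members = []
    for word, nxt in zip(tokens, tokens[1:]):
        if word == pivot and nxt != separator and nxt not in members:
            members.append(nxt)
    return members


tokens = TEXT.split()
the_x = right_class(tokens, "the")
print("THE-X:", the_x)
```

The class is built from position alone, so nouns and adjectives land in it together, exactly as the child-robot would have it.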











2. Similarly:

    31 | a | man bundle house house wolf


Class A-X: X= {man, bundle, house, wolf } 

These two classes can be expressed in terms of the vocabulary entries.

Since the class is in equilibrium with its entries, MAN is in equilibrium with its 
class (possibly, one of many). 

Class X-MAN : X = {a, the} and, therefore, X= A-X , THE-X. 

Confirmed by many occurrences, this classification will, most probably, survive. 

But then A and THE also form a class, for which we have no name other than the
cumbersome A_THE. Of course, we now know the current name of the class: the
article, which, by the way, is absent in many inflected languages.
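One way to picture how A and THE might fall into one class is to look for words that share right-hand neighbors. The Python sketch below is an assumption about how this could be done and not a mechanism stated in the text. The cutoff of two shared neighbors and the function names are hypothetical, and the "_" separators are inferred from the neighbor table.

```python
from itertools import combinations

# Token stream for P; the "_" positions are inferred from the neighbor table.
TEXT = (
    "there was an old sow with three little pigs and as she had not enough "
    "to keep them she sent them out to seek their fortune _ the first that "
    "went off met a man with a bundle of straw and said to him _ please man "
    "give me that straw to build a house _ which the man did and the little "
    "pig built a house _ presently came along a wolf and knocked at the door "
    "and said _ little pig let me come in _ the pig answered _ no _ the wolf "
    "then answered to that _ then I ll puff and I ll blow your house in _ so "
    "he puffed and he blew his house in and ate up the little pig"
)


def right_sets(tokens, separator="_"):
    """Map each word to the set of words seen immediately to its right."""
    sets = {}
    for word, nxt in zip(tokens, tokens[1:]):
        if separator in (word, nxt):
            continue  # do not bond across a sentence break
        sets.setdefault(word, set()).add(nxt)
    return sets


def shared_context_pairs(tokens, min_shared=2):
    """Pairs of words that share at least min_shared right neighbors."""
    sets = right_sets(tokens)
    result = {}
    for w1, w2 in combinations(sorted(sets), 2):
        common = sets[w1] & sets[w2]
        if len(common) >= min_shared:
            result[(w1, w2)] = common
    return result


tokens = TEXT.split()
classes = shared_context_pairs(tokens)
print(classes)
```

On this fragment the only pair that passes the cutoff is A and THE, which share MAN and WOLF.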


    45 | pig | built let answered

3. Similarly:

    an | 5 | old | sow

and

    three the _ the | 9 | little | pigs pig pig pig


would allow for inferring the distinction between nouns and adjectives, not quite reliable 
yet: 


X-Adjective-Y: X=Article, Y= Noun 









There is not enough data to form the class of nouns, however, but we can easily 
imagine that with enough verbs. 

The above examples could make us feel what a little child feels when acquiring
the knowledge of the world: what we know and what seems elementary and obvious must
be retrieved from the formless mass like the statue of David from the block of marble.
Unlike the sculptor, who cannot make a big mistake with the stone, the child’s mind
works like a scientist (or living nature), creating, testing, and rejecting hypotheses.

The overall picture of language acquisition—and, therefore, of language 
genesis—becomes the field for competition of patterns, which are counterparts of 
biological species, and not individual sentences. When the starting pattern is as simple as 
doublet or triplet, further mutations can generate the largest variety of grammars, which 
explains why the languages are on the surface so different. The mutations of developed 
grammars are, of course, less radical. 

The German separable verb prefixes seem to contradict the principle of locality,
but if we start with simple situations and short phrases, German is no more strange than
Japanese with its verb invariably at the end.

We can hope to reconstruct the process of linguistic genesis for two reasons: (1) 
we can understand the world of the first speakers where somebody does something to 
somebody or something, (2) we have only two choices for adding a new generator 
(morpheme): left and right of the old one. 

The short fragment illustrates only the main principle: if the words were atoms, 
there would be a chemistry of words. Linguists can easily see some parallels with the 
widely used connectionist models, methods of statistical inference, Bayesian 
categorization, and the so-called memoryless learning algorithms, when the next entry 
either confirms or contradicts the already formed rule, but the data are not stored.
Language acquisition fits into the fast growing area of unsupervised learning. It may turn 
out that there is much more consensus in linguistics than it appears, but the various areas 
do not have a lingua franca. The cobbler walks barefoot. 





The comparison of “chemical linguistics” with current approaches is a separate
topic to be discussed elsewhere. I will refer here only to the work of Sylvain Neuvel
and Sean A. Fulop, close in spirit and crisp in ideas, Unsupervised Learning of
Morphology Without Morphemes, http://arxiv.org/abs/cs.CL/0205072 , where the sign <->
in

    /X/a <-> /X'/b

means “bi-directional implication,” which is very close to what I want to express
with my sign <=> and what is the closest parallel to chemical equilibrium. The algorithm
for morphological analysis, i.e., identification of morphological generators, as I would
say, based on this principle, works on a POS-tagged lexicon. POS here means not
poverty of stimulus but part of speech. What I expect from the three little pigs, however,
is the POS-tagging.

Being strictly local, the “chemical” or “unsupervised” mechanisms seem to lead 
toward the distributed intelligence, working at many levels, from morphemes to phrases, 
creating, literally, a distributed grammar. It may open the way to new types of hardware
based neither on linear nor on parallel but on commutation processors [4] that imitate the 
chemical reaction vessel. In such hardware, an artificial molecular chaos must be 
maintained. The World Wide Web is a prototype of such a machine. The WWW is a big 
distributed intelligence, but the remaining problem is how to turn human minds into 
extremely simple automata with the properties of neurons without the properties of agents. 

The examples in the Appendix illustrate nothing but a vague guess. Its further
development, as well as comparison with other linguistic models of acquisition, should be
better left to those off-beat bees who might become attracted by the chemical smell of
strange flowers. The entire direction of Darwinian linguistics, started by Manfred Eigen
and continued by Martin Nowak, may then look like a blooming meadow. But as a chemist,
I cannot resist my addiction to chemical smells.





THE CHEMISTRY OF SEMANTICS 

Yuri Tarnopolsky 

2005 

ABSTRACT 

This preliminary e-paper continues the examination of language as a quasi- 
molecular system from the point of view of a chemist who happens to ask, “What 
if the words were atoms?” The general principles of atomism, locality, simplicity, 
and transition state are applied to the unobservable structure of thought, regarded 
as a configuration in Pattern Theory (Ulf Grenander). The non-linear thought 
interconnects maximum three atomic observables and the coordinate of change in 
the configuration. The thought is subject to linearization in order to be translated 
into an utterance. The verb is regarded as the lexical marker of the coordinate of 
change. 

KEYWORDS: semantics, thought, idea, meaning, language acquisition, protolanguage, 
Navajo, language, pattern theory, chemistry, chemical nomenclature, speech generation, Ulf 
Grenander, Leonard Talmy. 



What if the words were atoms? 





1. EXAMPLE 


Left side: Horemheb gives drinks to Hathor. 
A copy of a picture from the tomb of Horemheb, 
who became pharaoh around 1321 BC. 

Right side: Written representation of the idea 
of giving in Egyptian, Chinese, English, 
Swahili, Russian, Navajo, and Hungarian. 

I see the pharaoh, the goddess, and the 
drinks. But where is the give? And what 
about the to? 



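[Figure: the word for giving in each script; legible entries: give (English), kupa (Swahili), дать (Russian), nikaah (Navajo), ad (Hungarian).]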


2. INTRODUCTION 


It is quite remarkable that the subject of meaning, with which everybody is intimately 
familiar, is so evasive. The mind is still an open frontier for thought. 

The literature on semantics and its various aspects is enormous [1] but not really
contentious. It seems that any contributor to it is able to offer his or her own system and 
notation to represent meaning, without joining an army, however. 






As I believe, the lack of a consensus combined with the absence of an intense debate is
good evidence that the subject of a discussion either does not exist or should be looked at
from a different angle.


There could be little, if any, debate about the meaning of the absolute majority of 
particular words (although even the existence of word is debated), but not about meaning 
itself. Does meaning exist? Is it anything else than what we find in any monolingual 
dictionary? For my part, I am completely satisfied by the article meaning in Webster’s II: 

1. Something signified by a word. 2. Something one wishes to convey, esp. by language. Etc. 

If I am content with the obviously vague “something,” it is because I usually know what I
myself mean (but see my concluding remarks). The problem arises when we realize that
although the dictionary definition is the result of a consensus, no dictionary can tell you 
what Mr. Smith means right now by saying “I don’t like it at all” or Mrs. Johnson wished 
to convey yesterday by saying “So nice to see you.” Are meaning and intent the same? 
There are split opinions, too. One can find in the literature a collection of mental 
variations on the theme of meaning as Baroque as Bach’s Goldberg Variations. 

It is, in short, music which observes neither end nor beginning, music with neither real climax nor
real resolution, music which, like Baudelaire's lovers, "rests lightly on the wings of the
unchecked wind." Glenn Gould http://www.rjgeib.com/music/Top-Ten/gould.html

I am neither a linguist nor a philosopher. All I want is to launch the problem on the wings 
of the chemical wind and see whether it flies. 

As somebody who did some volume of translation, I can testify that there is no general 
problem in rendering meaning across the language barrier. All problems of translation, 
however pervasive, are minor, specific and local. Although there is no way to know what 
exactly people meant by writing a sentence in Hungarian or Japanese, we can usually 
have a very good guess and normally know it precisely. Even machine translation can do
a good job in most cases. Moreover, machine translation within a very narrow
professional context, for example, weather or medicine, where consensus dominates, can
be very successful. The absolute majority of professional languages are tailored to avoid
(and rarely to enhance) ambiguity. All I needed in order to translate a piece of technical 
literature from a foreign language was to understand how the described device or system 
looked, worked, or behaved. I did not care about what the author thought. To understand 
meant to say it in my native language, although some moment of a sudden revelation 
could precede it in hard cases. 

I believe that theoretical semantics is an attempt to represent by means of human 
language what by definition is as little observable as a deity: human thought. This 
apparent paradox dissolves if we separate language from thought and thought from 
meaning. Language is the language of thought. Period. Without language the thought is 
mute. But what is thought? Language is a kind of order. Is thought a kind of disorder? 
This verbal prestidigitation makes some sense to a chemist. 

This e-paper continues the examination of language as a quasi-molecular system from the 
point of view of a chemist who happens to ask, “What if the words were atoms?” [2]. By 
analyzing the text of a folk tale as a sequence of short fragments and following a few 
simple rules we can see how the language acquisition creates a quasi-chemical system in 
which some sequences (“molecules”) are more probable than others and some are quite 
improbable. For example, in the system of the Tale of The Three Little Pigs [2D] we can 
generate sentences Pig built a house and Man gave the straw. Although Pig gave the 
straw and Man built a house (as well as other combinations) are not supported by the 
context, the pattern Subject Verb Object is. All this is well known in computational
linguistics and all a chemist could add is a different mode of thinking, which I call,
defiantly, simplicity. It is natural for a chemist to regard anything complex as a structure
(molecule) of atomic (i.e., simple) objects held together or rearranged by local (i.e., 
simple) interactions. 
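The point about the pattern Subject Verb Object can be made concrete with a small Python sketch. It is an illustration only: the two attested sentences are taken from the paragraph above, while the slot-filling helper `slot_fillers` is a hypothetical name and the recombination is the simplest one imaginable.

```python
from itertools import product

# Sentences the text treats as supported by the tale, split as (S, V, O).
attested = [("Pig", "built", "a house"), ("Man", "gave", "the straw")]


def slot_fillers(sentences):
    """Collect the fillers seen in each slot, preserving first-seen order."""
    slots = ([], [], [])
    for sentence in sentences:
        for slot, item in zip(slots, sentence):
            if item not in slot:
                slot.append(item)
    return slots


subjects, verbs, objects = slot_fillers(attested)
candidates = list(product(subjects, verbs, objects))
for s, v, o in candidates:
    mark = "attested" if (s, v, o) in attested else "pattern only"
    print(f"{s} {v} {o}: {mark}")
```

Every candidate fits the pattern, but only the two original sentences are supported by the context; the rest are the "other combinations" the pattern allows.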

Although the idea of a computing homunculus is deeply alien to chemistry, the idea of 

chemistry is not alien to computation [2G]. Neither is the idea of language to chemistry. 



The problem, illustrated by the above examples, from pharaoh to pig, is: what happens
right before we irreversibly release the utterance into the air and even before we silently
formulate it to be released?




3. CHEMISTRY AND CONFUSION 


Chemistry found itself in a state of teenage confusion right before turning into a science
about a century and a half ago. It was clear that matter consisted of some unobservable
units that could be combined in different quantities. This paradigm, as old as science
itself, had been laid out in eloquent verses by Lucretius, the Roman devotee of
Democritus the Greek, whose own work time has preserved far less well.

What chemistry accomplished, due to Alexander Butlerov (1828-1886) and Friedrich 
August Kekule (1829-1896), was to separate the chemical reality, as it is seen—but not 
understood—through the transparent wall of the flask, from what we cannot see but can 
understand. It was done by a kind of minimalist program: we have no idea what is going 
on there, but we are able to deduce which Democritean entities (atoms of chemical 
elements) are in the substance and which of them are connected. 

Starting with Jacobus Van’t Hoff (1852-1911), the very first Nobel Laureate in chemistry (1901),
chemistry has been eagerly acquiring from physics all the new gadgets invented to see the
atomic details directly. Van’t Hoff was among the first to look molecules right in the
face (and what he saw was a kind of the double image which the drunks are believed to see). Before
modern instrumentation, the absence or presence of a chemical bond between
two atoms could only be deduced logically from the observable behavior of the molecule,
for example, by chemically splitting A-B-C-D in the middle and identifying the
fragments A-B and C-D. In the art of connecting the dots the chemists had to have
the skills and talents of Sherlock Holmes and Dr. Watson combined.

If thought is represented as atomic entities connected in a certain order, which is what
semantics does for a living, the way chemistry handled the invisible may be instructive for
semantics. In what order? This we can deduce from human behavior, especially from
speech. But, as chemistry discovered at some point around 1950, the real state of
molecular affairs in the flask was a kind of creative chaos that could not even be denoted
by common chemical formulas. This discovery of the role of the fuzzy and irregular
transition state in chemical transformation was the final act of maturity for chemistry as
a science, which is not the same as ultimate wisdom, of course. What we can learn from chemistry
is how the evasive and fleeting state of confusion (thought) creates certainty (language).

Should we wait until neurophysiology provides us with some kind of a device to observe 
the thought? This is a terrifying prospect for a human being who will turn into a simple 
screw of the social machine, so that the pipe dreams of Joseph Stalin would become 
reality overnight. Fortunately, there is a big snag for the physiology: the brains of the 
poor mice are split open to researchers, but mice do not think the way humans do, if they 
think at all. The brains of the humans—the last wilderness on earth waiting for a 
bulldozer—are for a while relatively protected, unless spilled out electronically. 

There are two areas of chemical reality. The first one includes the stable terminal objects 
such as molecules (atoms are not always stable). The language for verbal one-to-one 
description of molecules of any size and structure is known as chemical nomenclature [3]. 
It can be found in any textbook of organic chemistry. In principle, it is similar to a 
system of rules and conventions used to identify the appearance of an animal in zoology, 
a plant in botany, or a human appearance in criminology, but with a substantial difference: 
the atomic composition and the connections between atoms can be completely extracted 
from the linear verbal description and vice versa because of a one-to-one correspondence 
between them, although molecular structure is typically non-linear. As I have repeated many 
times [2], this unnoticed elephant in the room is well worth the attention of linguistics and 
semantics. 



The second area comprises chemical change, i.e., variations and transformations of 
chemical composition and structure. The chemical names are all nouns, but the terms of 
chemical dynamics are mostly either verbs or related verb-derived nouns (like to 
substitute or the substitution). 

There is another subtle but important aspect of Chemicalese. If a chemical paper (and for 
that matter any scientific paper) contains a description of an experiment, it is done in a 
particular, significantly standardized language which, like the language of chemical 
nomenclature, is designed to ensure the exact reproduction of the experiment by any 
other chemist, as the following fragment [4] illustrates: 

Exp. 4. Reaction of Naphthazarin (3) with Cyclohexa-1,3-diene 
To a solution of naphthazarin (3) (1.9 g, 10 mmol) in toluene (40 cm³) was added 
cyclohexa-1,3-diene (2.43 g, 30 mmol). The mixture was heated at reflux under argon for 
3 days. After removal of the solvent and residual diene by evaporation, the crude 
cycloadduct (2.62 g) was obtained. 

Dr. Jonathan Miller’s home page, from which I quote, is a delightful induction into the 
atmosphere of organic chemistry. 

An especially concise style of experiment description is used in Chemical Abstracts. 

This kind of language is intentionally primitive as compared with the more sophisticated 
(although dry enough) language of chemical theory. The latter, however, loses all its luster against 
extremes of Legalese and Greenspanese at the high end of the art of obfuscation. Anyway, the rest of 
the chemical paper—introduction, review of literature, discussion, and conclusion—is not 
as rigid as the experimental part. 

As a stalker of the invisible, I limit my linguistic interests to human language in its earlier 
stages, when it was emerging from the hypothetical protolanguage Nean [2B] as a no-frills 
means of communicating information in a live coverage style. We can see a 
reflection of this stage in the development of language by infants and little children. 



By definition, information is conveyed if a message is expected with sufficiently low 
probability, such as a message about the appearance of a predator or prey, or the loss of a hunter. The 
language of chemical experiment is a good model of the early evolutionary stage when 
things are typically conveyed only if there are alternatives. I see the upper end of this 
particular stage as the emergence of plans, desires, myths, and tales describing 
imaginary or unobservable events in a maximally simplified (from a modern point of 
view) matter-of-fact language. 

It is rarely possible to translate good poetry. Nevertheless, Ivan Bunin’s famous 
translation of Henry Longfellow’s Hiawatha into Russian was regarded by some as better 
than the original. I explain its success by the descriptive, explicit, and universally human 
nature of the mythological material. 

Following the example of chemistry, let us call the atoms of thought ideas and 
acknowledge that they can be of two kinds (innumerable literature has been piling up 
since times immemorial). One kind comprises the terminal ideas that stand for (mean, 
denote, signify, relate to) individual objects: dog, pine, man, shadow, at which we can 
point a finger. Qualities and quantities, like red, big, slow, form bonds with them but are 
not material in the same sense dog Spot is. Neither are verbs, like run, sit, eat. They all 
belong to the second kind, together with other ideas like from, inside, under. I am not 
ready to suggest any terminology except primary and applied ideas. The give and the to 
in the Example with which this paper starts are applied ideas, invisible per se. Thought is 
a set of ideas. A thought itself can be an idea in another, more complex thought. 



4. THE GIVE AS AN IDEA 


The give, which I was looking for in the Egyptian wall painting, is something whose 
nature per se I do not know. It occupies a place in my mental construct involving other 
components, which unlike the give are directly observable and belong to the first kind. It 
has some relation to the verb give. No wonder I cannot see the give, because it is an 
applied idea, as applied as the paint to the wall of Horemheb’s tomb. We need some substrate to 
apply it to. 

The following is a tentative imitation of a chemist’s approach to thought as a molecule. 
This parallels the approach of Pattern Theory [5] in which atoms are called generators 
and molecules are configurations. In addition to that, Pattern Theory systematizes 
configurations and their transformations as patterns, which chemistry also does. 

Imagine (departing from Egyptian paintings) that we are watching a real life or soap 
opera episode with Ken (K) and Lucy (L), two people we can see and represent as 
primary ideas. Ken is giving (g, applied) Money (M: a bag of collectible quarters, also a 
primary) to Lucy right before our eyes, from hand to hand. 

This is how the episode could be described in Nean [IB]: 

Ken. Lucy. Ken Money. Ken give. Ken Money, give Money, 
give Lucy, give Money. Money Lucy. Money Lucy. Lucy Money. 

Do not underestimate the raw expressive power of Nean, rich enough to do even without 
the verb give: 


Ken Money. Ken Money. Ken Lucy. Ken. Lucy Money. Lucy Money. 



Something like Nean is used by sports commentators, especially for the fast moving 
soccer ball passed from player to player: “Pele. Didi. Pele again. Didi. Pele.” 

Used in such a manner, Nean has no problem with differentiating between the donor and the 
acceptor if the narrator and the listener speak the same dialect of Nean. If Lucy were the 
donor, we would start with Lucy Money and end with Ken Money. The natural temporal 
flow of speech in the Nean grammar is the verb tense marker even without the verb. 

From my perch, Nean represents the very moment of conception of the verb and tense 
categories, as the old chemists used to say, in statu nascendi. A linguist could say that 
Money performs both noun and verb functions, quite natural for English, as the word 
hand does. So to speak, Ken monies Lucy. Compare with hypothetical Ken hands his 
hand to Lucy or Ken clubs a lion with a club. While the first is ambiguous (touches? 
offers? proposes?), the second is pragmatically unambiguous even as Ken club lion or 
lion club Ken. 

There could be a different solution for differentiating between Ken kill lion and lion kill 
Ken, as well as Ken break bone and Bone break Ken: to add semantic markers of 
distinction between a human and an animal, as well as animate and inanimate, or even 
between long and short, flat and round, etc. The nominal classes of Swahili and the 
hierarchy of nouns and the classificatory verb stems of Navajo are examples. The 
evidence of the original fusion of syntax and semantics is also seen in the noun classifiers 
of Chinese (“measure words”) and counting suffixes of Japanese. 

Semantic markers seem natural if the language evolves from overgeneralization (which 
happens with infants, too) when, for example, not only all liquids, but also some far- 
reaching chains of associations can have the same sign. 

In Sumerian, for example [6]: 

a, e₄: n., water; watercourse, canal; seminal fluid; offspring; father; tears; flood, 

u: n., plant; vegetable; grass; food; bread; pasture; load 

See an interesting discussion of this subject in [6]. 







To speculate further, the relatively redundant agreement, as between nouns and 
adjectives in Russian and Swahili, may be an artifact of Nean. I would even dare to 
suggest that all the markers come from the primary ideas of the inherently redundant 
Nean. They probably evolved as means of speeding up the transfer of information, which 
was, I think, the main driving force of early language evolution, quite similarly to the 
evolution of species, in modern terms, and evolution of technology in modern times. 


Next I will try to develop a chemistry-friendly representation of the non-linear thought 
that generates the linear utterance K-g-M-t-L, in which the letters are atomic symbols 
for words. 

I regard the pre-historic thought as a simple aggregate of a few ideas that are all 
interconnected just because they are together by some criterion, namely, belonging to 
the same thought reflecting the same spatial or temporal closeness. One could ponder 
whether the topology of thought is a full graph or the zero topology of a (simple) set in 
which elements are labeled, but are neither connected nor duplicated. 

A set with duplicates is called a bag. A set is what collectors display in a collection where items do not 
have duplicates, or what words form in a dictionary. One can also remark that if thought is just a set, 
animals think too, which I would not dispute. Humans, however, went much farther, but how 
far we will never measure until we know the starting point. 
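
The distinction between a set, a bag, and the two candidate topologies of thought can be illustrated with a minimal Python sketch. The word list below is a hypothetical Nean utterance invented for illustration, and the choice of Python containers is an assumption of this sketch, not part of the original argument.

```python
from collections import Counter
from itertools import combinations

# Hypothetical Nean utterance, already split into ideas (illustrative only).
utterance = ["Ken", "Money", "Ken", "Lucy", "Lucy", "Money"]

# A bag (multiset) keeps duplicates; a set does not.
bag = Counter(utterance)
ideas = set(utterance)

# "Zero topology": labeled elements with no bonds at all.
no_bonds = set()

# "Full graph": every pair of distinct ideas is bonded.
full_graph = {frozenset(pair) for pair in combinations(sorted(ideas), 2)}

print(bag)              # how many times each idea occurs in the bag
print(sorted(ideas))    # the same ideas with duplicates removed
print(len(full_graph))  # n(n-1)/2 bonds for n ideas
```

For n ideas the full graph has n(n-1)/2 bonds, so even a small thought is densely connected under that reading, while the plain set carries no structure beyond membership.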



[Figure: Perception (1), Thought (2), Speech (3)] 

We can point to the primary and stable K, L, and M, but not to give or to, unless in 
speech. Let us tentatively reserve some space for the latter two in the perception and 
thought, denoting their ghostly presence with outline font and their connections with 
dotted lines. The lines mean that all the atoms of thought are more or less connected in 
the object of thought, although we may not know how exactly. This point is what 
distinguishes the approach of Pattern Theory [5] and chemistry to structures. The 
structure, whether molecule, thought, or utterance, is a configuration, while pattern is, 
approximately, a class of configurations, for which the rules of similarity within the class 
are formulated. 

In order to notice the action, we need to look at the event at least twice: 



[Figures (4) and (5): two successive views of the event] 


Having done that we can, at last, notice that there are some changing connections 
between the ideas, invisible in a static picture. The strong (double line) bond between M 
and K (4) moves to the position between M and L (5). To put it differently, M moves 
(migrates) from K to L. 

In a chemical style notation: 

K—M + L → K + M—L    (6) 

or 

K—L        K—L 
|     →      |       (7) 
M            M 

This transformation is a pattern, applicable to many primary ideas, and pattern is an 
overgeneralization, much exceeding in its range the Sumerian notion a (water, etc.). 


















Focusing on the points of change, we come to the “chemical” equation, in which the 
broken lines symbolize the transition state: 

K—M + L → [ K···M···L ] → K + M—L    (8) 

Chemical transformations in an isolated system are reversible and the transition state can 
lead to either the initial or the final state. The uncertainty can be reflected in language: 
Ken is considering giving money to Lucy. Or even: K and L fight for M. For an 
irreversible action, the single arrow indicates the direction of the transformation. 

Regarding transformations, the principle of atomism means that they can be decomposed 
into simple and further indivisible steps. In the spirit of chemistry we can define simple 
(atomic) acts as occurring within a topological neighborhood. The complex acts can be 
reduced to a sequence of simple ones. This principle strongly limits the variety of simple 
actions: 


A—B → A  B          Ken Money → Ken, Money          (9) 

A  B → A—B          Lucy, Money → Lucy Money        (10) 

A → ∅               Money → (no money)              (11) 

A  ∅ → A  B         Adam → Adam, Eve                (12) 

Even transformations that are already simple enough can be represented as a sequence of the 
above simple steps: 

A—B → A—C           Sour grape → sweet grape 

A—B → A  ∅ ;  A  C → A—C                            (13) 

A—B—C → C—B—A       Dog bite man → Man bite dog 

A—B—C → A—B  C → A  B  C → A  C—B → C—B—A           (14) 
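
The simple steps (9) to (12) and the decomposition (13) can be sketched in Python as operations on a configuration, taken here to be a set of generators plus a set of bonds. The class name `Config`, the function names, and the rule that only an unbonded generator may disappear are assumptions made for this illustration; they are not notation from Pattern Theory or from chemistry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A configuration: a set of generators (ideas) and a set of bonds."""
    atoms: frozenset
    bonds: frozenset  # each bond is a frozenset of two generators


def bond(a, b):
    """An undirected bond between two generators."""
    return frozenset((a, b))


def break_bond(c, a, b):
    """Step (9): A—B -> A  B."""
    if bond(a, b) not in c.bonds:
        raise ValueError("no such bond")
    return Config(c.atoms, c.bonds - {bond(a, b)})


def form_bond(c, a, b):
    """Step (10): A  B -> A—B."""
    if a == b or not {a, b} <= c.atoms:
        raise ValueError("both generators must be present")
    return Config(c.atoms, c.bonds | {bond(a, b)})


def remove_atom(c, a):
    """Step (11): A -> nothing. Assumption: bonds must be broken first."""
    if any(a in b for b in c.bonds):
        raise ValueError("break the bonds first")
    return Config(c.atoms - {a}, c.bonds)


def add_atom(c, a):
    """Step (12): a new generator appears."""
    return Config(c.atoms | {a}, c.bonds)


# Decomposition (13): sour grape -> sweet grape, one atomic step at a time.
start = Config(frozenset({"grape", "sour"}),
               frozenset({bond("grape", "sour")}))
step1 = break_bond(start, "grape", "sour")   # grape  sour
step2 = remove_atom(step1, "sour")           # grape
step3 = add_atom(step2, "sweet")             # grape  sweet
final = form_bond(step3, "grape", "sweet")   # grape—sweet

print(sorted(final.atoms), [sorted(b) for b in final.bonds])
```

Bonds are undirected in this sketch, so it cannot distinguish A—B—C from C—B—A; the reversal in (14) concerns linear order, which would need an ordered representation.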



There are at most four atomic components of a change: formation (1) and breakup (2) 
of a single bond and appearance (3) and disappearance (4) of a generator. Jumping to 
thought, a simple thought P (the stem cell of all thoughts) is, at most, a quartet: 

P = {A, B, A—B, →}    (15) 

where A and B are ideas, A—B is a bond between them, and the arrow → is a coordinate of 
change, i.e., the locus of change, for example, appearance/disappearance of an atomic 
idea or locking/breakup of the bond. The arrow, for which pointer is a better term, 
points to the changing component by connecting the initial and the final states of an act 
of change. There are at least two possible notations for a change (16, 17): 

A—B → A          Ken Lucy → Ken (alone)          (16) 

A—B 
  ↑ 

A—B → A—C        Sour grape → sweet grape        (17) 

A—B , A  C 
  ↑      ↑ 

The arrow (pointer) is, in my opinion, the precursor of the verb. In the early steps of 
language evolution it was probably a single sound or a gesture signifying or symbolizing 
action (theories of the gestural origin of language are vigorous today). It would be used only 
if the context was obscure. 


A—B 
 \ / 
  C        (18) 

Note that the “atomic” configuration A B C in (14) makes no sense in 
linear speech where the adjacency is already a bond. Neither does a 
cyclic configuration (18), equally impossible in linear speech. Both 
can be seen as short-lived unstable stages of the process of linearization 
of thought into speech. What is wrong as speech is OK as thought. 



Transition state in chemistry is anything but a state. It is a process which can be 
approximated by a sequence of elementary steps, not in all-or-nothing fashion, but as a 
continuously changing distribution of bonds of variable strength. I must emphasize that 
the chemical transition state is still as little observable as the thought, although the 
situation could change. What helps is that the elementary steps are always local. 


5. FROM THE POINTER TO THE VERB 


Strictly speaking, the give (the thought, not the verb) consists of four elements: 

{K, M, L, →} 

I hypothesize that the thought give is a transition configuration between the 
representation of the perceived scene in the mind (which is outside linguistics) and the 
utterance that conveys the thought. 


Ken give Money to Lucy 

or ( 19 ) 

Ken give Lucy Money 

We still cannot put a finger on the arrow, but whatever it is, the four-component thought 
transition state is highly ambiguous: it can be linearized into utterance in 4!=24 ways. 
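
The count 4! = 24 can be checked by enumerating every ordering of the four components. In this minimal sketch the word "give" stands in for the pointer, which is an assumption made only so that the orderings can be printed as utterances.

```python
from itertools import permutations

# The four components of the thought: K, the pointer (written as "give"), M, L.
thought = ("Ken", "give", "Money", "Lucy")

# Every possible linearization of the four components into an utterance.
linearizations = [" ".join(order) for order in permutations(thought)]

print(len(linearizations))  # number of orderings, 4! for four distinct components
for utterance in linearizations[:3]:
    print(utterance)
```

Both forms in (19), apart from the preposition "to", appear among these orderings; nothing in the bare list says which of them a grammar would prefer.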

[Figure: The give] 

The alternative form Ken give Lucy Money is unambiguous only because Lucy is 
animate and Money is not, which we know, but, unlike in Swahili, do not mark. Ken put 
box bag is highly ambiguous because box and bag can be put into each other. In a 
language similar to Navajo and more elaborate than Swahili, the difference can be 
marked by verbal object suffixes, and it in fact is marked in Navajo [7], where bag and 
box require different roots for verb give: ni'q for give box and niyj for give bag. The 
ambiguity can be eliminated if the position in the four-word sequence alone signifies the 
semantic role, which even the morphologically skimpy English language does relatively 
rarely. 

Grammar, from the chemist’s point of view, is a way (more accurately, a catalyst) to 
reduce the stress of the transition state. In terms of chemistry, it gives a preference for 
linearizing the thought in a particular way, reducing the entropy of choice, although the 
grammar can often be violated without detriment to understanding. The post-Nean 
grammar works by either attaching various markers or freezing the word order, or both. 

While the word order has its limits (no more than three ideas form a linear 
neighborhood A—B—C), even a single morphological marker can help a lot. 

If we attach a marker (in speech, not in thought!) of the direct object (Obj) to Money, 
for example, all three of its transition bonds (red) are not needed anymore: 



Money, with its marker (infix or case ending), can be now placed anywhere in the 
utterance, although a certain predominant word order characterizes all languages. This 
ordering is necessary because the remaining transition triangle K, L, g is still too 
ambiguous to linearize. 






















If we add a marker for the indirect object (L), in the form of a dative case (21) or a 
preposition (to), the only remaining ambiguity is the relative positions of K and g. 

It can be disambiguated (I hate this word!) in two ways: by marking either the subject or the 
pointer. 
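
The way each added marker shrinks the ambiguity can be put in numbers with a small Python sketch. It rests on a deliberately crude assumption that is not in the text: a listener hears four words, a marked word reveals its role wherever it stands, and the unmarked words can be matched to the remaining roles in any order. Pragmatic knowledge (such as Lucy being animate) and word-order conventions are ignored.

```python
from itertools import permutations

# Roles of the four components: K, g, M, L (names are illustrative).
ROLES = ("agent", "pointer", "object", "recipient")


def readings(marked):
    """Count the role assignments left open when only `marked` roles carry a marker."""
    unmarked = [role for role in ROLES if role not in marked]
    # Each ordering of the unmarked roles is one way to match them to the unmarked words.
    return sum(1 for _ in permutations(unmarked))


stages = [
    set(),                             # nothing marked
    {"object"},                        # Money carries an object marker
    {"object", "recipient"},           # Lucy also marked (dative or "to")
    {"object", "recipient", "agent"},  # subject marked as well
]
for marked in stages:
    print(sorted(marked), readings(marked))
```

Under these assumptions the count falls from 24 to 6 (the triangle K, L, g), then to 2 (K versus g), then to 1, which mirrors the sequence of steps described above. Marking the pointer instead of the subject at the last stage gives the same final count.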


Japanese has a prominent marker for the subject (-ga), as well as for both objects (-o and 
-ni), plus a topic marker (-wa). Nevertheless, the end position of the verb, which is also 
often marked with a suffix (-mas, -suru, -iru, etc.), is fixed. Hungarian, which marks 
the objects (with -t and -nek) but not the subject, has no fixed word order, probably 
because the verb, lexically different from noun, is almost always marked. The Japanese 
sentence looks like a single word with fixed positions for all components, clearly marked. 
Swahili resolves the ambiguity by neatly packing all semantic object markers into the 
verb, already loaded with tense, but it still needs an SVO word order to keep both S and 
O close to V. Russian typically marks all it can mark but the subject, so that the 
unmarkedness of the subject is the marker itself, and Russian, like Hungarian, has a very 
free word order. Navajo lists the nouns in the very beginning in a certain order, according 
to their rank of animacy: 

Human → Infant/Big Animal → Medium-size Animal → Small Animal → Natural Force → Abstraction 

Such differences between the languages could be a fossil pit for paleontology of language. 

How can we mark an arrow which, theoretically, is always the same and should be 
assigned a single word? Whatever marker we attach to the same unique word, it will be 
just another unique word, which, by the way, does not need any marking. The verb is just 
the lexical marker of the pointer, which is not the only way to do it because the verb, 
strictly speaking, is not necessary. Parts of speech, syllables, words: are they a matter 
of convention? Certainly not in English. But together with some linguists I strongly 
suspect that modern linguistics as the science of language bends, trembles, and groans 
under the heavy Indo-European burden imposed by Panini’s grammar of Sanskrit. The 
ancient speaker of the emerging language never knew what terms the grammarians would 
use afterwards. 

As a free-thinker, I feel a great deal of mistrust when the linguists speak about SOV word 
order when the verb is always at the end. This is the verb-last order, people! 

Language evolved as patterns of thought and patterns of its linearization, which, as 
we see, can be done in a vast variety of ways. Neither word nor verb is a universal 
concept of language. Furthermore, the patterns of thought are representations of patterns of the world 
(the idea clearly expressed by Spinoza), but all we know about the world is our representation of it. 

The thought in the following interlingual form (22) is, probably, sufficient to translate it 
into any language equilibrated with a relatively simple and stable reality. The arrow here 
points to K and M, which means that Ken has to part with his Money and Lucy is ready 
to accept the bag (we don’t know if she will) but not necessarily Ken himself. The 
Egyptian hieroglyph rdi is chosen here to represent the idea of give. 



Why then did the verbs develop at all if they were not necessary? This is something 
paleolinguistics should be concerned with. It is pure speculation on my part, but I 
suggest that the patterns of human thought arose from the use of tools, which is rather 
atypical of animals. To kill, eat, mate, share, cheat, and fight required two components of 
thought. The tools and the hands to use them in various ways were, probably, responsible 
for the development of three- and four-component thoughts. Much more can be 
speculated on this platform, but I had better stop. 


Structure (23) gives some hints as to how thought could further develop complexity while 
preserving its ancient core and compatibility with linearization, but it is beyond the scope 
of my strictly provisional paper. 


6. NOTE ON GIVE IN NAVAJO (in which I am by no means an expert). 

Polysynthetic (a chemistry-smelling term!) languages, like Navajo, linearize thought in a 
way that is very strange for Europeans but basically natural: the markers are not attached to the 
nouns but incorporated into the “verb” and show who or what does what to whom or to 
what in what manner. In other words, the “verb” is the skeletal phrase, a linearized 
pattern of thought, preceded by the unmarked nouns [7]: 

Łééchąąʼí mósí yinoołchééł 
dog cat it-is-chasing-it 

the dog is chasing the cat 











Abandoning the firm chemical ground under my feet, I would say that the above example 
looks to me as the embryonic “stem cell” for all languages. 

The structure of Navajo seems overwhelmingly complicated, primarily because of the 
prohibitive complexity of the available printed dictionaries, but with some doggedness a 
chemist can see how it works. The language of chemical nomenclature [3] is a close 
relative of polysynthetic languages. Example [4]: 1,4,4a,9a-tetrahydro-5,8-dihydroxy-1,4-ethano-9,10-anthraquinone. 
See [4] (Exp. 4) for what it depicts. 

It is my uneducated guess that the elaborate polysynthetic languages, like Navajo (not too 
many of them are well studied), do not have verbs in Indo-European sense. Instead, they 
use ideograms, similar to Egyptian and Chinese ones, as well as to the German 
compound words, built not from graphic elements but from phonemes. The pictorial 
nature of Navajo (“mental television”) was noted by the native speakers. 

Why are tribal languages often very elaborate and sophisticated? Because thousands of 
years of stable ordered life with little social and political turmoil are beneficial for the 
slow evolution of complex patterns of culture. Just compare with the dumbed down, 
primitive, fragmented, kaleidoscopic, superficial, ephemeral, hectic, and hysterical 
culture of the twenty-first century. 

The Navajo “verb” is a whole proposition [7A,B], which means that it is actually the 
entire phrase in which the primary components (Subject, Direct Object, Indirect Object) 
are always the same standardized elements, similar to pronouns and, if you wish, to the 
suffixes of classes in chemical nomenclature: hydroxy or ol for alcohol, di for dual 
plural, one for the >C=O group, etc. The primary elements are listed in the very 
beginning, usually without markers. By analogy with Japanese, where the topic of the 
proposition is a wa-ending noun or noun group in the very beginning, I would call Navajo 
a polytopical language: all its nouns are the participants of the action/state, and not just 
one of them, as in Japanese. They are like listed roles and actors in titles with which older 
movies begin (but the stars go first), giving the clue to the components of the 
subsequent “verb.” 

One can imagine a process by which an ancient polysynthetic language, with not fully 
differentiated syntax and semantics, further evolves into a more segmented one, in which 
typical (for most of us) nouns develop from the anterior protagonist list of the sentence, 
typical verb develops from the posterior “action” part, and typical adverbs, adjectives, 
and pronouns scatter from what is in between, releasing also some nouns frozen into the 
“verb.” By backtracking we can descend to more primitive forms in which noun and 
verb, the primary and the applied components, are not differentiated. In this way we can 
see the relation between Navajo, Swahili, Japanese, and Indo-European languages. All 
languages are equal under Thought, but some are more equal than others. 

The noun classifiers in Chinese and classificatory verb stems in Navajo look like the 
same linguistic phenomenon connecting the two languages across the Pacific, which is 
not quite surprising after all if we accept the Dene-Caucasian hypothesis [8]. There is, 
however, much more similarity between Chinese and Navajo in the very nature of close 
relation between thought and speech, topicalization, and tightly fused sentence, while 
classificators seem to be a more ancient phenomenon. Both languages use ideograms: one 
for writing, the other for speech. Naturally, Chinese needs phonetic elements in its 
ideograms, which Navajo has gratis. Both are poorly equipped for representing new 
sounds and concepts. But the sparse, lean English-like Chinese needs more of context. 

The verb in the Euro-centric sense is always at the very end of the Navajo “verb,” where 
it signifies a very abstract idea of a pattern of change (for example, simple horizontal 
movement of Direct Object from Subject to Indirect Object, which is called migration in 
chemistry) and a more concrete but abstract enough situation to which it is applied 
(something in an open container), plus the very elaborate dynamics of the process (for 
example, tense, mode, and aspect). This is what nikaah (to give something in an open 
container, momentaneous imperfect form, 3rd person singular, if I am not mistaken) in the 
Example means, although this entire Note is a gross simplification. The stem of the verb 
nikaah is K4 [7A, p. 294; 7C], which means an object or substance in an open 
container. Nevertheless, if one can handle chemical complexity, one can manage the 
complexity of Navajo, the language which is an adventure in itself. Chemistry becomes 
simple in simple cases and so does Navajo. The problems, however, arise when we apply 
a complex language to a complex reality, in both chemistry and linguistics. The slow 
communication makes little sense, unless in correspondence between philosophers. 

The wonderful worlds of Navajo and Chinese leave open unique windows into human 
thought, but they should be better left to another opportunity of discussing Nean and 
kinetic aspects of language evolution. 


7. CONCLUDING REMARKS 

Although I ignored here the Himalayas of related literature, one major linguistic 
movement that starts with a natural-scientific analysis of an event, before it becomes a 
thought, must be mentioned. It is associated with the names of George Lakoff, Mark 
Johnson, and other enthusiasts of cognitive linguistics, to whom a chemist can feel an 
instinctive attraction [2B]. As Laura Janda writes, 

From the very beginning, cognitive linguistics has been a refuge for linguists who are 
intimately acquainted with real language data and have a profound respect for empirical 
methods [9]. 

There are, however, even closer kindred souls for a chemist who asks “what if the words 
were indeed atoms?” Leonard Talmy [10] develops in his cognitive semantics the entire 
directly observable physics of the world from the point of view of its linearization into 
speech. This seems to me a remarkably innovative and tall idea, at which the chemist 
alone would not arrive, but could post factum formulate it as: the things have 
geometrical, physical, chemical and other properties, but they also have a property 
of being told about in a language. The languages of Navajo type open a window into 
the fascinating process of language evolution driven by the general kinetic properties of 
the world, noticed first by René Thom [11, 2B]. 

My first main idea is that all components of human thought, i.e., primary and applied 
ideas, are connected, although some are more connected than others. The connections 
may vary over the short time while the thought resides in the mind before either being 
abandoned, or dumped to memory, or expressed in speech or action. This assumption of 
an attention window strongly limits the size of the thought. Ars longa, vita brevis. The 
language of art, science, and culture may be long, but we do not have time or memory 
space for long thoughts. 

How much chemistry is in this strange conjecture? Quite a lot, I believe. All atoms 
interact at a short distance and either attract or repel each other, with repulsion 
predominant at a very close distance. This is how the proteins take their shape by 
minimizing the overall strain. I assume that all ideas in the thought are in the 
neighborhood of each other. The same is observed in speech and this is how the 
sentence takes its shape: “Speech in observed the is same” is utterly strained, while “In 
speech the same is observed” is much less so. 

The local principles alone are clearly insufficient for ordering modern sophisticated 
language, which is a result of a long cultural evolution influenced by writing and public 
oration. It is the language of style and culture, not just of the thought. But they might be 
sufficient to fill the gap between the protolanguage and the cultivated language of writers 
and philosophers with its long-distance interaction and central planning of the 
composition. They might also be sufficient for the spontaneous language stimulated by a 
fast developing event. Linguistics as a natural science must be the science of time, as 
chemistry dominated by kinetics is. 


There is no contradiction between thinking in language or in a mysterious language of 
thought. When we think in language, for example, internally formulate (“think over”) 
what we want to say before actually saying it, we simply broadcast the silently 
prepared and memorized text. When we see a person who is in danger of being hit by a 
car and want to warn him or her, we do not have time to think and are in the same 
position an ancient man was in while watching a lion behind the back of his companion 
hunter. 


Linguistics as natural science cannot have sharp borders. But let us deal only with the problems 
related to thinking and generating speech. If we decided to deal also with actions, we would come 
to a firm conclusion that mice think, as all animals do, and this would make some of us feel very 
uncomfortable while eating them. 

My second main idea (partly resonating with the ideas of Sergei Starostin [8]) is that in 
order to understand language as a natural phenomenon we have (1) to start with very 
small and primitive systems, (2) to see how a small evolutionary step can be made, and 
(3) to apply the procedure repeatedly not only up to present, but also up to a not too 
remote future of language, so that we could check the theory. If this is reminiscent of
the method of mathematical induction or of classical Darwinism, I absolutely do not mind.
The language acquisition by children does not recapitulate language origin. What both 
represent is evolution of complexity from simplicity by simplicity. 

WHISPERING TO A HORSE: My purpose was to test whether I would be able to cross a river
over thin ice from the thought side to the speech side, or, one could say, to
circumnavigate the brain from the right to the left hemisphere, in my chemical rubber
boots. I am far from my goal and not even sure that I am crossing the right river or that
I even know right from left. Later I will, probably, give it another try by applying the
principles stated here to the lingo-chemical system of The Tale of Three Little Pigs,
which would supply the component missing here: pragmatics, i.e., the naive physics of the
world. But do I really know what I mean by what I say? As I stated here and in previous
e-papers, to understand means to tell somebody else. In a short story by Chekhov, the
man tells his troubles to a horse.





REFERENCES 

More references can be found in [2]. 

See also http://spirospero.net/complexity.htm 


1. Chalmers, David. Online papers on consciousness, part 2: Other Philosophy of 

Mind. Compiled by David Chalmers, http://consc.net/online2.html 

2. Tarnopolsky, Yuri.

A. Molecules and Thoughts. Molecules and Thoughts: Pattern 
Complexity and Evolution in Chemical Systems and the Mind , 2003. 
http://www.dam.brown.edu/ptg/REPORTS/MINDSCALE.pdf 

or: http://spirospero.net/MINDSCALE.pdf 

B. Tikki Tikki Tembo: The Chemistry of Protolanguage
http://spirospero.net/Nean.pdf 

C. Pattern Theory and “Poverty of Stimulus” argument in linguistics.
http://spirospero.net/Poverty of stimulus.pdf 

D. The Three Little Pigs : Chemistry of language acquisition. 
http://spirospero.net/3LP.pdf 

E. Salt: The Incremental Chemistry of Language Acquisition 
http://spirospero.net/Salt.pdf 

F. Salt 2: Incremental Extraction of Grammar by Simplistic Rules 
http://spirospero.net/Salt2.pdf 

G. Molecular computation: a chemist’s view. 
http://spirospero.net/PTutor.pdf 

3. Organic nomenclature 

http://www.absoluteastronomy.com/encyclopedia/o/or/organic nomenclature.htm

See also any textbook of organic chemistry. 

4. Miller, Jonathan. General Experimental Details 
http://www.jonathanpmiller.com/experimental/

In: Jonathan Miller’s Home Page. 
http://www.jonathanpmiller.com/ 


5. Grenander, Ulf.

A. 1995. Elements of Pattern Theory. Baltimore: Johns Hopkins University Press.

B. 1993. General Pattern Theory. A Mathematical Study of Regular Structures.
Oxford, New York: Oxford University Press. (Advanced).

C. Patterns of Thought. www.dam.brown.edu/ptg/REPORTS/mind.pdf
Watch for updates; see also: www.dam.brown.edu/ptg/publications.shtml


6. Halloran, John. The Proto-Sumerian Language Invention Process 

http://www.sumerian.org/prot-sum.htm

7. Navajo Language 

A. Young, Robert W. and Morgan, William, Sr. 1992. Analytical Lexicon 
of Navajo. Albuquerque: University of New Mexico Press. 

B. McDonough, Joyce M. 2000. How to use Young and Morgan’s “The 
Navajo Language.” In K. M. Crosswhite & J. S. Magnuson (Eds.), University of 
Rochester Working Papers in the Language Sciences, 1 (2), 195-214. 

www.bcs.rochester.edu/cls/G000n2/mcdonough.pdf 

C. Analytical Lexicon of Navajo. Online, under construction . Stem KA. 
http://www.speech.cs.cmu.edu/egads2/navajo/entry7K%c2 

8. The Dene-Caucasian Hypothesis (John Bengtson, Sergei Starostin). Santa 

Fe Institute, http://ehl.santafe.edu/denecauc.htm . See also the entire EHL 
project. 

9. Janda, Laura. Cognitive Linguistics. 

www.indiana.edu/~slavconf/SLING2K/pospapers/janda.pdf 

10. Talmy, Leonard. 2000. Toward a Cognitive Semantics. Cambridge, Mass. 

The MIT Press. http://linguistics.buffalo.edu/people/faculty/talmy/talmyweb/TCS.html


11. Thom, Rene. 1975. Structural Stability and Morphogenesis: An Outline of a 
General Theory of Models. Reading, Mass.: W. A. Benjamin, Inc. 


yuri@ids.net 


What if the words were indeed atoms? 


First published: May 28, 2005 
Last revision: July 19, 2005 













Salt: The Incremental Chemistry of Language 

Acquisition 


Yuri Tarnopolsky 
2005 


Abstract 

This e-paper continues the examination of language as a quasi-molecular system 
from the point of view of a chemist who happens to ask, “What if the words were 
atoms?” Ideas of Pattern Theory (Ulf Grenander) are used as a kind of generalized 
chemistry. The Hungarian folktale A So (Salt) is represented as a sequence of 
syllabic triplets. Small portions of the text are fed to a quasi-chemical reactor 
working according to previously described principles of acquisition and
categorization of generators. The gradual development of categorization and 
aggregates of syllables is illustrated. 






Draft 

Last major update: March 2, 2005 


Introduction 

This e-paper directly follows and complements the previous one [1], the introduction, 
literature review, content, and discussion of which will not be repeated here except for 
two notes. 

First, my primary subject of interest is atomistic systems in general, within the 
framework of Ulf Grenander’s Pattern Theory [2,3], which encompasses both molecules 
and utterances, as well as practically everything perceived by senses and/or reason. 

Second, my previous attempt to analyze a fragment of the Hungarian folk tale A SO
(Salt) [4] in the same manner as The Three Little Pigs, i.e., regarding words as
generators [1], showed no promise because of the agglutinative nature of Hungarian.
There were too few words that could be centers for categorization and generator
acquisition because most functional morphemes stayed appended within the word limits.
Here I attempt to analyze the same text, taking syllables as generators and gradually
adding sentences into the focus of attention.

The choice of text was dictated by its availability on the Web in both text and
audio forms, as well as by its cultural origins. The folk tale is a perfect window into the
language because of its transparency, simplicity of context, universality of human
experience, and repetitions. The folk tales are relics of the earlier stages of language
evolution when the complexity of life and ideas did not press hard on the language,
extruding multilevel sculptures of wired-together fragments that needed a long attention
span and training to understand. The tales correspond to the bygone era when the entire
society spoke the same language. A folk tale is like a book of one page, so that you do
not need to turn pages in order to follow the plot while keeping in mind what was on the
previous pages, now out of sight. The tale is designed to be told, not written. Moreover, it
is designed for children. Repetition is the mother of learning.

I am not a speaker of Hungarian. My knowledge of the language is limited to 
superficial familiarity with grammar and some experience with translating into Russian 
the poetry of the highly original, passionate, and innovative Hungarian poet Endre Ady 
(1877-1919). 

I choose Hungarian because it seems to be the opposite of English, has a phonetic
system of writing, a fixed stress, and a very rational, slim, and non-redundant grammar,
and cannot be understood by most readers of this paper, if such be found.
Therefore, the aspects of structure, which are at the heart of Pattern Theory, as well as
chemistry, will not be obscured by cultural and semantic predisposition. With
Hungarian, the aspects of grammar and alternation will not be too overpowering, as
could happen, for example, with the much more exuberant Russian or Turkish.

The Hungarian syllables should be perceived here as small labeled atomistic
objects capable of forming linear chains. As a chemist would say, the syllables are
monomers and the phrases are polymers, while phonemes are the true atoms of language.
Oligomers, i.e., short linear fragments, are called here blocks.

I myself, however, cannot be free of bias. I have a personal impression of
Hungarian as a very elegant, graceful, and beautiful creation of language evolution,
although modern special texts (probably in any language) might make a different
impression.

Regardless of that, it seems that the fragmentation of speech into syllables is as
arbitrary as division into words. I feel, for example, great discomfort when the numerous
forms of the noun in Hungarian and Finnish are considered cases because I see the
endings as just postpositions written together with the root and other markers.

The problems with hyphenation that arise in languages with long words, such as
Russian and Hungarian, can be very complicated. The syllable segmentation that I use
here is arbitrary, but biased by my instinctive desire to bring the syllable as close to the
morpheme as possible. One cannot know what a morpheme is in an unfamiliar language,
but I cannot ignore my own knowledge. For example, the old word bátyámuram, used to
address an older person, literally “my brother my lord”, splits phonetically, with my
Russian phonetic habits, into ba-tya-mu-ram, but I see its morphologic constituents as
ba-tyam ur-am (or baty-am or, splitting the long á, ba-tya-am).

I doubt there is a “right” way to segment speech in a written text. The ancient
scribes did not know the space between the words and modern Chinese does without it,
being naturally syllabic. Therefore, I just leave it as I choose because I am not interested
here in the factual truth, often disputed in linguistics, but in the operational truth: I don’t
know what it is, but let us see how “it” behaves. Ultimately, I believe only in
phonological facts but I am not qualified to represent them. The “facts” of the written
language are just conventions, more or less reasonable. The sound is a physical fact.

I inserted the pauses from the available soundtrack. In most cases, the pause 
coincides with a punctuation mark, but not always. The STOP is the full stop. 

I am convinced of the leading role of prosody and pitch in early language
acquisition by infants, on which there is a large and growing body of solid experimental
work, off the battlefields of formal linguistics. I mark the stressed syllables by
capitalization, but I realize that this is somewhat arbitrary, too. Some longer Hungarian
words have an additional stress and the sentence often starts with a raised pitch. Should I
mark all monosyllabic words as stressed? I simply do not know. To complicate the picture,
in Hungarian, as in Russian, a single consonant can be a morpheme, which is more
typical for polysynthetic languages. Such a lean loner may not be a legitimate syllable,
but it is definitely a generator in the sense of Pattern Theory [2,3].

The descent to the level of phonemes is an intriguing task, for which Italian is a
good medium. Until then, I take the syllable segmentation for granted. Therefore, I do not
pay attention to generator identification here, except for some special cases of markers.


Illustrations and discussion 


The following is a description of an experiment. By no means is it a computer simulation
of some realistic object. This is only an illustration of principles. THIS IS ONLY A
TEST. I use the computer only for representing the text and sorting out
the results. The MATLAB output is then easily converted into tables with MS Word
Table functions.

If this is an experiment, we have to describe its subject participant. I prefer to call
the subject robot-child. It is an imaginary system that can be described only
approximately and vaguely. It possesses tunable memory and attention span and is
supposed to learn something from the input.

The strings of syllables are fed into the robot-child’s mind (which I see as a kind 
of a chemical reactor) where they are digested into overlapping triplets. It is important 
that the original string is not remembered, but its stable repetitive fragments, as well as 
new knowledge, could be. 
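
The digestion of a string into overlapping triplets can be sketched in a few lines. This is only an illustration, written in Python rather than the MATLAB used for the experiment; the function name overlapping_triplets is hypothetical, and the syllable list is the first sentence of the tale as segmented in Step 1.

```python
def overlapping_triplets(syllables):
    """Return every run of three consecutive syllables, in order."""
    return [tuple(syllables[i:i + 3]) for i in range(len(syllables) - 2)]

# First sentence of the tale; capital letters mark stressed syllables.
P1 = ["STOP", "volt", "EGY", "szer", "egy", "O", "reg", "KI", "raly",
      "PAUSE", "s", "HA", "rom", "szep", "LE", "any", "a", "STOP"]

triplets = overlapping_triplets(P1)
print(len(triplets))   # 16
print(triplets[0])     # ('STOP', 'volt', 'EGY')
```

Each triplet is a generator with its left and right neighbor. The original string itself is not kept, only the triplets.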





The memory stores: 

1. The syllables. 

2. Relatively stable bonds of some of the syllables with the neighbors in the 
original strings. 

3. Classes (categories) obtained as a result of simple local operations.

4. Some non-syllabic generators as grammar markers. 

The chemistry of the mind of the robot-child is defined by simple local rules. For 
all the explanations see [1], which is absolutely necessary for understanding this paper. 

At each step, a sentence from the text is added to all the previous ones and the 
total is analyzed as a whole. This is not exactly what happens during the language 
acquisition, but further modifications in the direction of realism are possible. By the 
realism I mean here a fast forgetting of most details of the freshly perceived utterances. 

Step 1 

MATLAB input: P1 = char('STOP', 'volt', 'EGY', 'szer', 'egy', 'O', 'reg', 'KI', 'raly',
'PAUSE', 's', 'HA', 'rom', 'szep', 'LE', 'any', 'a', 'STOP'); P = P1;

Table 1

No.  LEFT   N  Generator  RIGHT
1    #      2  STOP       volt
2    STOP   1  volt       EGY
3    volt   1  EGY        szer
4    EGY    1  szer       egy
5    szer   1  egy        O
6    egy    1  O          reg
7    O      1  reg        KI
8    reg    1  KI         raly
9    KI     1  raly       PAUSE
10   raly   1  PAUSE      s
11   PAUSE  1  s          HA
12   s      1  HA         rom
13   HA     1  rom        szep
14   rom    1  szep       LE
15   szep   1  LE         any
16   LE     1  any        a
17   any    1  a          STOP



The numbers in the third column are occurrences. 

The table has no double entries. STOP and PAUSE are ignored. 
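
A table of this kind can be built mechanically from the syllable string. The sketch below is an assumption about how such a table could be computed, not the MATLAB code actually used; neighbor_table and fmt are hypothetical names, and the input is the first two sentences of the tale as given in Steps 1 and 2.

```python
from collections import Counter, defaultdict

def neighbor_table(syllables):
    """Map each generator to its occurrence count and its left/right neighbor counts."""
    table = defaultdict(lambda: {"n": 0, "left": Counter(), "right": Counter()})
    for i, g in enumerate(syllables):
        row = table[g]
        row["n"] += 1
        if i > 0:
            row["left"][syllables[i - 1]] += 1
        if i < len(syllables) - 1:
            row["right"][syllables[i + 1]] += 1
    return dict(table)

def fmt(counter):
    """Render neighbors in the notation of the tables: '2-reg' means 'reg' seen twice."""
    return ", ".join(f"{n}-{s}" if n > 1 else s for s, n in sorted(counter.items()))

P1 = ["STOP", "volt", "EGY", "szer", "egy", "O", "reg", "KI", "raly",
      "PAUSE", "s", "HA", "rom", "szep", "LE", "any", "a", "STOP"]
P2 = ["az", "O", "reg", "KI", "raly", "SZE", "ret", "te", "VOL", "na", "mind",
      "a", "HA", "rom", "LE", "any", "at", "FERJ", "hez", "AD", "ni", "STOP"]

table = neighbor_table(P1 + P2)
for g in ("O", "rom", "a"):
    row = table[g]
    print(fmt(row["left"]), "|", row["n"], "|", g, "|", fmt(row["right"]))
```

For the generators O, rom, and a, the neighbors and counts agree with the corresponding lines of Table 2. The rows for the markers STOP and PAUSE may differ from the printed tables, because the sketch treats the markers like any other syllable.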
































Step 2 


P2 = char('az', 'O', 'reg', 'KI', 'raly', 'SZE', 'ret', 'te', 'VOL', 'na', 'mind', 'a', 'HA',
'rom', 'LE', 'any', 'at', 'FERJ', 'hez', 'AD', 'ni', 'STOP'); P = strvcat(P1, P2);


Table 2

No.  LEFT       N  G      RIGHT
1    a          3  STOP   az
2    STOP       1  volt   EGY
3    volt       1  EGY    szer
4    EGY        1  szer   egy
5    szer       1  egy    O
6    az, egy    2  O      2-reg
7    2-O        2  reg    2-KI
8    2-reg      2  KI     2-raly
9    2-KI       2  raly   PAUSE, SZE
10   raly       1  PAUSE  s
11   PAUSE      1  s      HA
12   a, s       2  HA     2-rom
13   2-HA       2  rom    LE, szep
14   rom        1  szep   LE
15   rom, szep  2  LE     2-any
16   2-LE       2  any    a, at
17   mind, any  2  a      HA, STOP
18   STOP       1  az     O
19   raly       1  SZE    ret
20   SZE        1  ret    te
21   ret        1  te     VOL
22   te         1  VOL    na
23   VOL        1  na     mind
24   na         1  mind   a
25   any        1  at     FERJ
26   at         1  FERJ   hez
27   FERJ       1  hez    AD
28   hez        1  AD     ni
29   AD         1  ni     STOP


After Step 2 we see multiple (double, to be exact) entries of some generators into the 
Table. Two “chemical” processes are, therefore, possible: 

1. Class acquisition

Class (= category, pattern) W requires two or more different neighbors (A, B, C...)
on one side of generator X.

NOTE: Letter W is chosen because it is not in the Hungarian alphabet, except for foreign words.

No.  LEFT     N  G  RIGHT
No.  A, B, C  3  X  ...        {A, B, C}-X => W-X;  {A, B, C} <=> W
No.  ...      3  X  A, B, C    X-{A, B, C} => X-W;  {A, B, C} <=> W

2. Bond A-X acquisition

Bond A-X requires two or more occurrences of the same neighbor A of X.

No.  LEFT  N  G  RIGHT
No.  2-A   2  X  ...           {A, X} <=> A-X
No.  ...   2  X  2-A           {X, A} <=> X-A
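
The two rules are local enough to be applied to one table row at a time. The following Python sketch is one possible reading of them, not the author's program: the function acquire and the IGNORED set are hypothetical, and skipping STOP and PAUSE when forming classes is an assumption based on the note that these markers are ignored.

```python
from collections import Counter

IGNORED = {"STOP", "PAUSE"}  # assumption: markers do not take part in classes or bonds

def acquire(left, right, generator):
    """Apply both rules to one row; left/right are Counters of neighbors.
    Returns (classes, bonds) as lists of strings in the notation of the text."""
    classes, bonds = [], []
    for side, raw in (("left", left), ("right", right)):
        neighbors = {s: n for s, n in raw.items() if s not in IGNORED}
        # Bond rule: the same neighbor occurs two or more times.
        for s in sorted(s for s, n in neighbors.items() if n >= 2):
            bonds.append(f"{s}-{generator}" if side == "left" else f"{generator}-{s}")
        # Class rule: two or more different neighbors on one side.
        if len(neighbors) >= 2:
            members = "{" + ", ".join(sorted(neighbors)) + "}"
            classes.append(f"{members}-{generator}" if side == "left"
                           else f"{generator}-{members}")
    return classes, bonds

# Line 6 of Table 2: az, egy | 2 | O | 2-reg
print(acquire(Counter(az=1, egy=1), Counter(reg=2), "O"))
# Line 13 of Table 2: 2-HA | 2 | rom | LE, szep
print(acquire(Counter(HA=2), Counter(LE=1, szep=1), "rom"))
```

Line 6 yields the class {az, egy}-O and the bond O-reg. Line 13 yields the bond HA-rom and the class rom-{LE, szep}.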


Step 2 produces new classes from the following lines of Table 2
(*** stands for non-participating generators):

No.  LEFT       N  G    RIGHT
6    az, egy    2  O    ***
12   a, s       2  HA   ***
13   ***        2  rom  LE, szep
15   rom, szep  2  LE   2-any
16   ***        2  any  a, at
17   mind, any  2  a    ***

W1  {az, egy}-O
W2  {a, s}-HA
W3  rom-{LE, szep}
W4  any-{a, at}
W5  {mind, any}-a


Step 2 produces new bonds from the following lines of Table 2:

No.  LEFT       N  G     RIGHT
6    ***        2  O     2-reg
7    2-O        2  reg   2-KI
8    2-reg      2  KI    2-raly
9    2-KI       2  raly  ***
15   rom, szep  2  LE    2-any
16   2-LE       2  any   a, at

{O, reg} <=> O-reg;  {reg, KI} <=> reg-KI;  {KI, raly} <=> KI-raly;
{O, reg, KI, raly} <=> O-reg-KI-raly;  {LE, any} <=> LE-any

Since LE and any belong to classes W3 and W5, other equilibriums are possible:

{LE, any} <=> LE-any

W6  rom-{LE, szep} => rom-{LE-any, szep}
W7  {mind, any}-a => {mind, LE-any}-a

The further fate of newly formed bonds and classes depends on the function of 
memory: bonds, as well as classes will fade away unless strengthened by repetitions. The 
extremely limited and repetitive infant-addressed speech is essential for language 
acquisition. The limited environment of the infant (the blessing of the poverty of 
stimulus) satisfies this condition. 
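
How fading and strengthening interact is not specified here. The sketch below shows one hypothetical scheme, only to make the idea concrete: every stored bond loses a fixed fraction of its strength at each step, a bond that is observed again gains a fixed amount, and anything below a threshold is forgotten. The function update_strengths and the values of decay, gain, and threshold are all assumptions.

```python
def update_strengths(strengths, observed, decay=0.5, gain=1.0, threshold=0.1):
    """One time step: all bonds decay, observed bonds are reinforced,
    and bonds that fall below the threshold are dropped from memory."""
    updated = {}
    for bond in set(strengths) | set(observed):
        s = strengths.get(bond, 0.0) * decay
        if bond in observed:
            s += gain
        if s >= threshold:
            updated[bond] = s
    return updated

memory = {}
# O-reg is heard twice, LE-any only once; then the input stops.
for observed in [{"O-reg", "LE-any"}, {"O-reg"}, set(), set(), set()]:
    memory = update_strengths(memory, observed)

print(memory)  # only the repeated bond survives
```

With these made-up settings the bond heard twice outlives the bond heard once, which is the qualitative behavior described above.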

A folk tale like Salt is suitable for language expansion but not for acquisition by
infants. Child-addressed speech has been a subject of detailed investigation in many
languages, and its properties (almost no syntax, short phrases, pauses, and repetitions)
make me think that if there is anything innate in language acquisition, it is Motherese,
which sounds very much like Nean. For an introductory but richly detailed review, see [5].
Some Russian tales for children are very simple and repetitive, for example Kolobok
(“Round Bun”) [6], which is one of the very first tales read to children in Russia.

It is easy to see that if the steps of processing Salt are continued, the content of the
acquiring mind will soon become complicated. Its complexity does not exceed, however,
that of grammar, which is easily manageable by a young mind. It is important to
understand that such complex equilibriums are typical for chemical systems. They would
completely paralyze all practical chemistry, as well as the biochemistry of life, if not for
one circumstance: most of them are absolutely negligible because either the equilibrium is
shifted practically entirely toward one end or establishing the equilibrium takes a
very long, sometimes astronomical, time. In chemical reality, the content of the flask or
living cell is processed before the equilibrium is established, so that only a few
fast-forming products are present. Life and industrial chemistry never come to equilibrium.
Even in wine-making, achieving equilibrium during the maturation of the wine is just a
costly dream.
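
The difference between what forms fast and what is most stable can be shown with a toy system. In the sketch below, a substance A turns reversibly into B (quickly) or into C (slowly), and C is the more stable product. All rate constants are invented for illustration, and the simple Euler integration in the hypothetical function simulate is only a minimal sketch, not a method taken from the text.

```python
def simulate(t_end, dt=0.05):
    """Euler integration of A <=> B (fast) and A <=> C (slow); returns (a, b, c)."""
    k_ab, k_ba = 1.0, 0.1        # fast pair, equilibrium constant 10 (made up)
    k_ac, k_ca = 0.01, 0.0001    # slow pair, equilibrium constant 100 (made up)
    a, b, c = 1.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        to_b = (k_ab * a - k_ba * b) * dt
        to_c = (k_ac * a - k_ca * c) * dt
        a, b, c = a - to_b - to_c, b + to_b, c + to_c
    return a, b, c

a1, b1, c1 = simulate(5)       # the flask is emptied early
a2, b2, c2 = simulate(5000)    # the flask is left almost until equilibrium

print(b1 > c1)   # early on, the fast-forming product B dominates
print(c2 > b2)   # close to equilibrium, the more stable product C dominates
```

If the mixture is processed early, it contains mostly the fast product, whatever the final equilibrium would have been.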

I would dogmatize the principles of kinetics in human sciences in the following way:

We say what can be said faster. 

We understand what can be understood faster. 

We do what can be done faster. 

Examples: we say stupid things, fear mathematics, and marry a wrong person. 

The next question is how we can compute such systems where kinetics and not 
thermodynamics rules. They are not the same as what is understood by dynamical 
systems. As I believe, this is where the key to understanding the mind can be found. 

I suspect that computational chemistry has made some progress in this area, 
which is outside my expertise, but I am not familiar with the literature at this point. I 
leave this question open, adding only that I believe that Ulf Grenander’s GOLEM [3] is 
the right starting point because of its chemical properties. The only idea I would add is 
the kinetics. 

It is not so difficult to program the succession of steps, but it is more difficult to
include the kinetics because of problems with the structure of the transition state. It is
possibly complicated, but not hopeless. I am not prepared, however, at this stage to
approach the entire problem of computation. The molecules compute the state of the
system quite effectively, and I can see how they do it, but without new parallel hardware
it would take astronomical time to simulate it on a PC. In computational chemistry this
type of problem is represented by protein folding.

The information about the state of the system, kinetic or not, can be fully
represented by the Q and A matrix of Ulf Grenander [2,3]. Honestly, I am afraid that the
computation can only obscure the simplicity of basic ideas. In order to better illustrate
them incrementally, I am adding the next step.




Step 3 


P3 = char('ez', 'nem', 'is', 'lett', 'VOL', 'na', 'NE', 'hez', 'mert', 'HA', 'rom', 'OR', 'szag', 'a',
'volt', 'PAUSE', 'mind', 'a', 'HA', 'rom', 'LE', 'any', 'a', 'ra', 'JUT', 'ott', 'EGY', 'egy', 'OR',
'szag', 'STOP'); P = strvcat(P1, P2, P3);


Table 3

No.  LEFT               N  G      RIGHT
1    a, ni              4  STOP   az, ez
2    STOP, a            2  volt   EGY, PAUSE
3    ott, volt          2  EGY    egy, szer
4    EGY                1  szer   egy
5    EGY, szer          2  egy    OR, O
6    az, egy            2  O      2-reg
7    2-O                2  reg    2-KI
8    2-reg              2  KI     2-raly
9    2-KI               2  raly   PAUSE, SZE
10   raly, volt         2  PAUSE  mind, s
11   PAUSE              1  s      HA
12   2-a, mert, s       4  HA     4-rom
13   4-HA               4  rom    2-LE, OR, szep
14   rom                1  szep   LE
15   2-rom, szep        3  LE     3-any
16   3-LE               3  any    a, a, at
17   2-mind, szag, any  4  a      2-HA, STOP, volt
18   STOP               1  az     O
19   raly               1  SZE    ret
20   SZE                1  ret    te
21   ret                1  te     VOL
22   lett, te           2  VOL    2-na
23   2-VOL              2  na     NE, mind
24   PAUSE, na          2  mind   2-a
25   any                1  at     FERJ
26   at                 1  FERJ   hez
27   FERJ               1  hez    AD
28   hez                1  AD     ni
29   AD                 1  ni     STOP
30   STOP               1  ez     nem
31   ez                 1  nem    is
32   nem                1  is     lett
33   is                 1  lett   VOL
34   na                 1  NE     hez
35   NE                 1  hez    mert
36   hez                1  mert   HA
37   egy, rom           2  OR     2-szag
38   2-OR               2  szag   STOP, a
39   any                1  a      ra
40   a                  1  ra     JUT
41   ra                 1  JUT    ott
42   JUT                1  ott    EGY


We have to add to Table 3 the bonds and classes generated in the previous step:

W1  {az, egy}-O
W2  {a, s}-HA
W3  rom-{LE, szep}
W4  any-{a, at}
W5  {mind, any}-a
W6  rom-{LE, szep} => rom-{LE-any, szep}
W7  {mind, any}-a => {mind, LE-any}-a

{O, reg} <=> O-reg;  {reg, KI} <=> reg-KI;  {KI, raly} <=> KI-raly;
{O, reg, KI, raly} <=> O-reg-KI-raly;  {LE, any} <=> LE-any


The new material in Table 3 is highlighted yellow. New bonds and classes can be
extracted from the following lines, which can be characterized as relevant novelty:

No.  LEFT               N  G    RIGHT
6    az, egy            2  O    2-reg
12   2-a, mert, s       4  HA   4-rom
13   4-HA               4  rom  2-LE, OR, szep
17   2-mind, szag, any  4  a    2-HA, STOP, volt
22   lett, te           2  VOL  2-na
23   2-VOL              2  na   NE, mind
37   egy, rom           2  OR   2-szag

New bonds:

{VOL, na} <=> VOL-na
{OR, szag} <=> OR-szag































Expansion of old classes:

W2  {a, mert, s}-HA
W3  rom-{LE, szep, OR}
W5  {mind, any, szag}-a

New classes:

W8   {az, egy}-O
W9   a-{HA, volt}
W10  {lett, te}-VOL
W11  na-{NE, mind}

Some of the new equilibriums:

{mind} <=> mind-a-HA-rom-LE-any (“all three girls”)
{HA} <=> HA-rom-OR-szag
{KI-raly} <=> az-O-reg-KI-raly

I intentionally do not interpret the classes, but a linguist even very superficially
familiar with Hungarian will note that egy and az in W8 account for two of the three
Hungarian forms of the article (egy, a, and az). In the subsequent text, KI-raly will bind
all articles in a class and, therefore, all nouns and adjectives will be bound in a class by
the articles. This creeping and crawling process of triangulation, which in the eyes of a
chemist is nothing but catalysis, seems to be the essence of language acquisition and
growth in general. I would compare it with a group of mountain climbers who help and
safeguard each other.





I will illustrate the main idea of the “chemical” approach to language acquisition
from a slightly different angle, with relation to [7]. I repeat Table 2 with additional
columns that indicate only bonds and classes of syllables. Thus, generator O (1) forms
a bond with -reg on the right and (2) denotes a class including az and egy on the left.
Generator any (1) forms a bond with LE on the left and (2) denotes a class including a
and at on the right. Since we are at the very beginning of acquisition, the three last
columns do not contain any new information, but with the next steps new and more
general classes, for example, what we call Article or Noun, can be added.


Table 2A

No.  LEFT       N  G      RIGHT       BOND   Class UP       Class DOWN
1    a          3  STOP   az
2    STOP       1  volt   EGY
3    volt       1  EGY    szer
4    EGY        1  szer   egy
5    szer       1  egy    O
6    az, egy    2  O      2-reg       -reg   {az-, egy-}
7    2-O        2  reg    2-KI
8    2-reg      2  KI     2-raly      -raly
9    2-KI       2  raly   PAUSE, SZE  KI-
10   raly       1  PAUSE  s
11   PAUSE      1  s      HA
12   a, s       2  HA     2-rom       -rom   {a-, s-}
13   2-HA       2  rom    LE, szep    HA-                   {-LE, -szep}
14   rom        1  szep   LE
15   rom, szep  2  LE     2-any              {rom-, szep-}
16   2-LE       2  any    a, at                             {-a, -at}
17   mind, any  2  a      HA, STOP           {mind-, any-}
18   STOP       1  az     O
19   raly       1  SZE    ret
20   SZE        1  ret    te
21   ret        1  te     VOL
22   te         1  VOL    na
23   VOL        1  na     mind
24   na         1  mind   a
25   any        1  at     FERJ
26   at         1  FERJ   hez
27   FERJ       1  hez    AD
28   hez        1  AD     ni
29   AD         1  ni     STOP





Let us consider all possible arrangements of three generators: a, HA, and rom.

1. a-HA-rom   2. a-rom-HA   3. HA-a-rom

4. HA-rom-a   5. rom-a-HA   6. rom-HA-a

Only triplet a-HA-rom contains both the regular bond HA-rom and the regular
class a-HA. Therefore, the transition state from thought to utterance that contains this
triplet has the lowest energy among the six triplets, and the utterance that includes it
has the highest chance to be generated under normal conditions.
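
The comparison of the six arrangements can be checked by enumeration. The sketch below counts, for every ordering, how many adjacent pairs match the regular bond HA-rom or the regular class {a-, s-} on the left of HA from Table 2A. Treating a higher count as a lower energy of the transition state is an assumption made only for this illustration, and the names BONDS, CLASS_LEFT_OF, and regular_features are hypothetical.

```python
from itertools import permutations

BONDS = {("HA", "rom")}               # regular bond HA-rom
CLASS_LEFT_OF = {"HA": {"a", "s"}}    # regular class {a-, s-} to the left of HA

def regular_features(triplet):
    """Count adjacent pairs that match a known bond or a known class."""
    score = 0
    for left, right in zip(triplet, triplet[1:]):
        if (left, right) in BONDS:
            score += 1
        if left in CLASS_LEFT_OF.get(right, set()):
            score += 1
    return score

arrangements = list(permutations(["a", "HA", "rom"]))
ranking = sorted(arrangements, key=regular_features, reverse=True)

for t in ranking:
    print("-".join(t), regular_features(t))
```

Only a-HA-rom scores on both counts. Every other ordering matches at most one of the two regularities.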

Along the same principles, it follows from Table 2A (but not from the final state 
of language knowledge) that Fragment 1 below has better chances than Fragment 2. In 
both fragments, HA-rom (“three”) and LE-any (“girl”) are considered relatively stable 
generators. 

Fragment 1: [a]—[HA-rom]—[LE-any] .Fragment 2: [LE-any]—[a]—[HA-rom]. 


I balk at the next steps of acquisition before the process becomes too cumbersome, 
but here is the table for the next step: 


Step 4 

P4 = char('HA', 'nem', 'A', 'hogy', 'an', 'nincs', 'HA', 'rom', 'EGY', 'for', 'ma', 'AL', 'ma',
'PAUSE', 'ugy', 'a', 'HA', 'rom', 'OR', 'szag', 'sem', 'volt', 'EGY', 'for', 'ma', 'STOP');
P = strvcat(P1, P2, P3, P4);

Table 4

No.  LEFT                    N  G      RIGHT
1    a, ni, szag             5  STOP   HA, az, ez
2    STOP, a, sem            3  volt   2-EGY, PAUSE
3    ott, rom, 2-volt        4  EGY    egy, 2-for, szer
4    EGY                     1  szer   egy
5    EGY, szer               2  egy    OR, O
6    az, egy                 2  O      2-reg
7    2-O                     2  reg    2-KI
8    2-reg                   2  KI     2-raly
9    2-KI                    2  raly   PAUSE, SZE
10   ma, raly, volt          3  PAUSE  mind, s, ugy
11   PAUSE                   1  s      HA
12   3-a, mert, nincs, s     6  HA     6-rom
13   6-HA                    6  rom    EGY, 2-LE, 2-OR, szep
14   rom                     1  szep   LE
15   2-rom, szep             3  LE     3-any
16   3-LE                    3  any    a, a, at
17   2-mind, szag, any, ugy  5  a      3-HA, STOP, volt
18   STOP                    1  az     O
19   raly                    1  SZE    ret
20   SZE                     1  ret    te
21   ret                     1  te     VOL
22   lett, te                2  VOL    2-na
23   2-VOL                   2  na     NE, mind
24   PAUSE, na               2  mind   2-a
25   any                     1  at     FERJ
26   at                      1  FERJ   hez
27   FERJ                    1  hez    AD
28   hez                     1  AD     ni
29   AD                      1  ni     STOP
30   STOP                    1  ez     nem
31   HA, ez                  2  nem    A, is
32   nem                     1  is     lett
33   is                      1  lett   VOL
34   na                      1  NE     hez
35   NE                      1  hez    mert
36   hez                     1  mert   HA
37   egy, 2-rom              3  OR     3-szag
38   3-OR                    3  szag   STOP, a, sem
39   any                     1  a      ra
40   a                       1  ra     JUT
41   ra                      1  JUT    ott
42   JUT                     1  ott    EGY
43   STOP                    1  HA     nem
44   nem                     1  A      hogy
45   A                       1  hogy   an
46   hogy                    1  an     nincs
47   an                      1  nincs  HA
48   2-EGY                   2  for    2-ma
49   AL, 2-for               3  ma     AL, PAUSE, STOP
50   ma                      1  AL     ma
51   PAUSE                   1  ugy    a
52   a                       1  sem    volt


We can use a bigger or even complete text if we unrealistically assume that it all 
can be kept in the focus of attention (called window in psycholinguistics). This is 
impossible with real children but possible with robot-child. One practical implementation 
of language chemistry can be the actual simulation of robot-child and its learning. Like 
the wheel, it may be unnatural, but practical. 






I emphasize that the entire concept is no more than a hypothesis and it needs 
a lot of further confirmations of its validity, which could be my next task. 

The essence is, in very general terms, that the child creates its own grammar 
which is constantly and significantly updated by the input in an incremental manner. In 
other words, grammar is a kind of a biological species, very primitive in the beginning, 
which evolves in the environment of input and an exchange with environment into its 
final complex form which remains individual. The mechanism of this evolution consists 
of simplistic and somewhat mechanical rules. In this sense, the child possesses a real 
Language Acquisition Device, as Chomsky prophetically called it, which works until the 
active learning takes off. The LAD quite mechanically (i.e., “chemically”) generates 
hypotheses about the grammar (i.e., regularity of PT) for further tests. 

Thus, at the initial stage, if the entire input is nothing but a single tale about salt,
the child-robot probably knows that HA-rom (“three”) is a separate word because it
occurs in different environments, but perceives O-reg-KI-raly (“old king”) as a single
word because there is no evidence to the contrary.
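
The way repeated neighbors chain into blocks can be sketched as follows. This is a minimal illustration, not the procedure used for the tables: the function blocks is hypothetical, a bond is assumed whenever the same pair of adjacent syllables occurs at least twice, and only the first two sentences of the tale are used as input.

```python
from collections import Counter

def blocks(syllables, min_count=2):
    """Join consecutive syllables into a block wherever the adjacent pair
    occurs at least min_count times in the whole input."""
    pairs = Counter(zip(syllables, syllables[1:]))
    found, current = set(), [syllables[0]]
    for left, right in zip(syllables, syllables[1:]):
        if pairs[(left, right)] >= min_count:
            current.append(right)
        else:
            if len(current) > 1:
                found.add("-".join(current))
            current = [right]
    if len(current) > 1:
        found.add("-".join(current))
    return sorted(found)

P1 = ["STOP", "volt", "EGY", "szer", "egy", "O", "reg", "KI", "raly",
      "PAUSE", "s", "HA", "rom", "szep", "LE", "any", "a", "STOP"]
P2 = ["az", "O", "reg", "KI", "raly", "SZE", "ret", "te", "VOL", "na", "mind",
      "a", "HA", "rom", "LE", "any", "at", "FERJ", "hez", "AD", "ni", "STOP"]

print(blocks(P1 + P2))  # ['HA-rom', 'LE-any', 'O-reg-KI-raly']
```

After two sentences, O-reg-KI-raly already behaves as one unbroken block, while HA-rom and LE-any stay separate because different syllables have been seen between and around them.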

The complete text [4], translation, and its complete Table 4 are given in the 
APPENDIX for those who would like to see for themselves how much grammar can be 
extracted from the triplets. Obviously, not all of it, but quite a lot, which will be enough 
for telling simple tales in a simple language. 

Next I will select some lines from Table 4 and expand them, as before, with 
columns Bond (B), Class UP (CUP), Class Down (CDN), and New G(NG) , filled out 
manually in order to provide more illustrations. This time I will give some translations. 



No.  LEFT                       N   G     RIGHT                     Bond      CUP  CDN  NG
2    STOP, a, ez, kos, lan,     10  volt  AZ, 2-EGY, KU, PAUSE,     volt-EGY
     len, meg, nem, sem, szep             STOP, TER, a, csak, mind





Line 2. Since volt-EGY occurs more than once, formally, there is a bond between
them. This is a hypothesis the child-robot’s LAD has to make.







No.  LEFT                     N  G    RIGHT                    Bond                  CUP  CDN  NG
3    STOP, ig, meg, ott,      8  EGY  egy, et, 2-for,          EGY-for-ma, EGY-szer
     rom, ta, 2-volt                  ma, 3-szer
49   AL, EGY, Meg, 2-for      5  ma   AL, PAUSE, STOP, ga, is  forma





Line 3. Since there is a hypothetical bond EGY-szer (“once”), Lines 2 and 3 
imply a block volt-EGY-szer. This hypothesis could later be rejected, but in the 
context of the tale it is justified: “there was once...” is a standard beginning of a folk tale. 

Similarly, Lines 3 and 49 imply EDY-for-ma (“equal,” “same”) as a block, and 
this hypothesis is correct. Other examples of bond formation can be seen in Table 4. No 
in Line 223 is an interjection, naturally, between two pauses. 
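The step from two bonds to a block can be illustrated by joining bonds that share a middle syllable. The sketch below is in Python and assumes that the bonds are already available as pairs; the function name candidate_blocks is hypothetical, and the resulting blocks are only candidates that further input may reject.

```python
def candidate_blocks(bonds):
    """Join bonds (a, b) and (b, c) that share a middle syllable into blocks (a, b, c)."""
    bonds = set(bonds)
    return sorted((a, b, c)
                  for (a, b) in bonds
                  for (b2, c) in bonds
                  if b == b2)

# Bonds named in the discussion of Lines 2, 3, and 49.
bonds = [("volt", "EGY"), ("EGY", "szer"), ("EDY", "for"), ("for", "ma")]

blocks = candidate_blocks(bonds)
for block in blocks:
    print("-".join(block))
```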



| Line | LEFT | Count | G | RIGHT | Bond | Class Up | Class Down | NG |
|---|---|---|---|---|---|---|---|---|
| 53 | 2-PAUSE, STOP, szer, 2-en | 6 | azt | FEL, HAL, 3-MOND, ROS | aztMONDta | | | |
| 54 | 3-PAUSE, STOP, 3-azt | 7 | MOND | jad, 5-ta, tam | MONDta | | | |
| 55 | HAGY, JAR, 5-MOND, TISZ, TUD, ad, dol, fog, 2-lat, nal, tol, az | 17 | ta | BUZ, EDY, MA, MEG, SZOM, 7-a, az, hogy, lak, mar, szo | | V-ta (V = verb) | | |
| 80 | Azt, 3-PAUSE, ged, nem | 6 | KER | 6-dez | KERdez | | | |
| 81 | 6-KER | 6 | dez | lek, 4-te, tem | KERdezte | | | |
| 86 | a, ket | 2 | GA | 2-lamb | GAlamb | | | |
| 180 | 2-MEG | 2 | lat | 2-ta | MEGlatta | | | |
| 223 | 3-STOP | 3 | No | 3-PAUSE | PAUSE-No-PAUSE | | | |





Some of the bonds will survive, others will not. The grammar that the child-robot 
builds looks like a living, evolving ecosystem rather than an artificial zoo of 
species. 








| Line | LEFT | Count | G | RIGHT | Bond | Class Up | Class Down | NG |
|---|---|---|---|---|---|---|---|---|
| 17 | STOP, 2-ban, de, 2-dult, 2-ek, 2-em, ett, i, ik, ja, jart, jott, lamb, ment, 2-mind, mint, nak, ra, 2-raly, ralyt, szer, szett, szag, sirt, 7-ta, 6-te, tak, tek, volt, 3-any, ugy | 50 | a | 2-FI, GA, 3-HA, KAN, KEZ, 12-KI, KOZ, 6-LE, 4-LEG, 2-PA, PE, PIL, RU, 2-STOP, SZEL, TISZ, TOB, VA, i, 6-sot, volt | a-XYZ... | | -XYZ | |
| 21 | TET, 4-dez, el, 2-et, get, hat, ret, szelsz | 12 | te | MIND, 2-PAUSE, STOP, TOL, VOL, 6-a | KERdezte-a... | ...xyz-te | | |




Line 17. Hungarian a/az is the definite article. It is, as the mathematicians say, 
“degenerate” in having too many possible neighbors. Its function seems vague. But 
while most of the neighbors on the right are stressed (capitalized) syllables, i.e., 
beginnings of words, those on the left are, remarkably, all unstressed. 

NOTE: sot is the objective case of so, “salt.” I did not capitalize it because of 
the inconsistency of my hyphenation rules, but it is a noun. 

The hypothesis is that there is a bond between article a and any stressed syllable, 
denoted as XYZ. It also follows that the definite article forms a Class Down, i.e., a class 
of all words that follow it. A linguist would call it the class of nouns, but the child does 
not know linguistics. In the beginning of acquisition, morpheme i and verb volt (“was”) 
seem to be included in the class erroneously. This is because morpheme a is not only an 
article but also a multi-functional marker for verbs and nouns. The vocabulary of the tale 
is not sufficient to make some correct grammatical assignments. Other examples with 
XYZ or unstressed xyz can also be found in the Table. 
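The asymmetry between the two sides of a can be checked mechanically. The following Python sketch takes the LEFT and RIGHT neighbor lists of Line 17 (multiplicities dropped), assumes that "stressed" simply means "written in capitals" and that STOP and PAUSE are boundary markers rather than syllables, and splits each side accordingly. The helper name split_by_stress is hypothetical.

```python
BOUNDARIES = {"STOP", "PAUSE"}

def split_by_stress(neighbors):
    """Split syllables into stressed (all capitals) and unstressed, skipping boundaries."""
    syllables = [s for s in neighbors if s not in BOUNDARIES]
    stressed = [s for s in syllables if s.isupper()]
    unstressed = [s for s in syllables if not s.isupper()]
    return stressed, unstressed

# Neighbors of the article "a" as listed in Line 17 of Table 4.
left_of_a = ["STOP", "ban", "de", "dult", "ek", "em", "ett", "i", "ik", "ja",
             "jart", "jott", "lamb", "ment", "mind", "mint", "nak", "ra",
             "raly", "ralyt", "szer", "szett", "szag", "sirt", "ta", "te",
             "tak", "tek", "volt", "any", "ugy"]
right_of_a = ["FI", "GA", "HA", "KAN", "KEZ", "KI", "KOZ", "LE", "LEG", "PA",
              "PE", "PIL", "RU", "STOP", "SZEL", "TISZ", "TOB", "VA", "i",
              "sot", "volt"]

class_down, exceptions = split_by_stress(right_of_a)
stressed_left, unstressed_left = split_by_stress(left_of_a)

print(len(class_down), "stressed syllables follow a; exceptions:", exceptions)
print(len(stressed_left), "stressed syllables precede a")
```

The stressed right neighbors are the candidate Class Down of the article; the unstressed ones (i, sot, volt) are exactly the doubtful cases discussed above.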


| Line | LEFT | Count | G | RIGHT | Bond | Class Up | Class Down | New G |
|---|---|---|---|---|---|---|---|---|
| 18 | PAUSE, STOP, ban, csak, el, en, 2-hogy, ki, mert, mint, meg, ra, raly, sot, ta, ve, ilt, irt, ult | 20 | az | AJ, AP, AR, ASZ, EB, 3-EM, OR, UD, ED, ET, 8-O | az-XYZ... (X = vowel); azOreg... | | -XYZ | |
| 60 | PAUSE, 4-a, raly | 6 | LEG | i, job, 2-kis, nagy, szebb | a LEG kis? | | | |
| 117 | 5-kis | 5 | asz | 4-szony, szonyt | | | | -t |
| 220 | et, het, tet | 3 | tek | EGY, PAUSE, a | xyz-t-tek | | | -t |
| 95 | az, 3-es | 4 | AP | 2-ad, 2-am | APad, APam | | AP-{ad, am} | |
| 96 | 2-AP, BATY | 3 | am | 2-PAUSE, UR | | XYZ-am | | |




Line 18. Definite article a takes the form az before vowels. Due to the limitation 
of the current vocabulary, a hypothesis about the block az-O-reg-KI-raly (“the old king”) 
is formed, which is true only within the context of this tale, together with the correct bond 
az-XYZ..., where X = vowel. 
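The regularity "az before vowels, a before consonants" can be tested directly against the RIGHT columns of Lines 17 and 18. This is a small Python check, assuming that only the first letter of the following stressed syllable matters; the names VOWELS and starts_with_vowel are hypothetical.

```python
VOWELS = set("AEIOUÁÉÍÓÖŐÚÜŰ")

def starts_with_vowel(syllable):
    """True if the syllable begins with a (possibly accented) vowel."""
    return syllable[0].upper() in VOWELS

# Stressed right neighbors from Lines 18 (az) and 17 (a) of Table 4.
right_of_az = ["AJ", "AP", "AR", "ASZ", "EB", "EM", "OR", "UD", "ED", "ET", "O"]
right_of_a = ["FI", "GA", "HA", "KAN", "KEZ", "KI", "KOZ", "LE", "LEG", "PA",
              "PE", "PIL", "RU", "SZEL", "TISZ", "TOB", "VA"]

print(all(starts_with_vowel(s) for s in right_of_az))     # az + vowel
print(not any(starts_with_vowel(s) for s in right_of_a))  # a + consonant
```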

At the same time, the article a/az creates a large class of syllables belonging to 
what we call nouns, numerals, and adjectives, so that the acquisition device rather early 
marks the category of nouns and adjectives. 

Line 60. LEG (superlative morpheme) forms bonds with kis (“little, small”), 
which, in turn, can be extended to either asz (assz), bonded further with szony, or ebb. 

Therefore, kis can be a component of either block LEG-kis-ebb, “the youngest,” 
or block KI-raly-kis-asz-szony, “princess” (literally, king-little-lady). How can the 
child-robot decide which? It depends on what other components are there in the focus. In 
the tale, the choice is unambiguous because ebb and asszony are not encountered 
between two stops. 

Line 95. AP-ad (“your father”) and AP-am (“my father”) create a class {-ad, -am}, 
which itself creates class XYZ-{-ad, -am} if more XYZ with ad and/or am 
are encountered. Nouns and possessive endings are thereby bonded as two classes. 
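How one stem can seed a class of endings, and how that class then collects further stems, can be sketched as follows. The Python fragment assumes that right-neighbor counts are stored per generator; it uses the counts of Lines 95 and 96 plus a partial row for KI as a contrast. The function names ending_class and stems_sharing are hypothetical.

```python
def ending_class(right, stem):
    """Endings observed directly after the given stem."""
    return set(right.get(stem, {}))

def stems_sharing(right, endings):
    """All generators followed by at least one ending of the class."""
    return sorted(g for g, neighbors in right.items() if endings & set(neighbors))

# Right-neighbor counts taken from Table 4 (the KI row is abbreviated).
right = {
    "AP": {"ad": 2, "am": 2},
    "BATY": {"am": 1},
    "KI": {"raly": 23, "csi": 2},
}

endings = ending_class(right, "AP")
stems = stems_sharing(right, endings)
print(sorted(endings), stems)
```

With only this tale as input, the class XYZ contains just AP and BATY; each new tale can add stems to it without any change to the rule.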

Line 117. The fact that asszony (“woman, lady”) and asszonyt (Object Case) 
differ in a sound implies that t can be a generator, although it is not a syllable. And in fact, 
t is an important multifunctional marker in Hungarian. See also Line 220 . 
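A simple way to notice such a sub-syllabic generator is to look for pairs of known syllables that differ only by a short final segment. The Python sketch below assumes a one-letter suffix and a hypothetical helper suffix_generators; the sample list is a small selection of generators from Table 4.

```python
from collections import Counter

def suffix_generators(syllables, max_len=1):
    """Count final segments s such that both X and X+s are known syllables."""
    known = set(syllables)
    found = Counter()
    for syllable in known:
        for k in range(1, max_len + 1):
            if len(syllable) > k and syllable[:-k] in known:
                found[syllable[-k:]] += 1
    return found

# A few generators from Table 4 (szony/szonyt and raly/ralyt differ by -t).
syllables = ["szony", "szonyt", "raly", "ralyt", "kis", "asz", "LEG"]

candidates = suffix_generators(syllables)
print(dict(candidates))
```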

Those are some of many bits of grammar that the robot-child can acquire from the 
Tale of Salt. 




















As I hinted in previous e-papers [1], language acquisition is just one case of 
the entire class of processes of growth. Knowledge acquisition, for example, by scientific 
means, is another one. The child-robot works by building and updating hypotheses, as the 
scientist does. Biological mutations are hypotheses of a kind, too. Both biological and 
language-acquisition mutations do not involve any mind. 

I assume that the pattern model of the mind can be built incrementally, along the 
principles of simplicity [1], in the same way a child acquires language or builds a Lego 
palace, i.e., with almost no thinking involved. When some intermediate construct seems 
unstable (contradictory or counterproductive, i.e., working against homeostasis), an 
improvement is made by trial and error. Whatever I have presented regarding language, 
however, does not take homeostasis into account. The actual building of the model of the 
mind will naturally include it. 

In general, three ideas, none of them new, seem to emit guiding light: 

1. Language acquisition is a counterpart of biological evolution of species and 
natural evolution of knowledge. 

2. Language acquisition device (LAD) is really a device, which, like any device, 
works as a mechanism, even though somewhat lax and wobbly. 

3. Language acquisition is like packing a parachute: if some sections do not go 
first, and others last, it will not open. This is the essence of bootstrapping. 


Further work with Salt and other objects 

It is obvious that my manual analysis of triplet tables is subjective and cumbersome. A 
computer code for incremental simplistic learning from syllabic input is needed. Can the 
code itself be simple? How far can we go with simplicity? Do we need forgetting in order 
to learn? Etc. Ultimately, can we construct an interface between thought and speech, 
working on principles of kinetics? A next step in my program is to explore the transition 
state. The question is: how can a semantic configuration be squashed into the line with 
the help of a triplet grammar? As a chemist, I clearly see the whole mechanism, but to 
present it to non-chemists as a model is a challenge. 
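As one possible reading of the questions about simplicity and forgetting, here is a toy sketch in Python. It is not the code called for above, only an illustration under loud assumptions: each utterance is fed in one at a time, every stored bond strength is multiplied by a decay factor before the new utterance is counted, and a pair counts as a bond only while its strength stays above a threshold. The class name BondMemory and the parameters decay and threshold are hypothetical.

```python
class BondMemory:
    """Incremental pair counts with exponential forgetting (a toy assumption)."""

    def __init__(self, decay=0.5, threshold=1.2):
        self.decay = decay
        self.threshold = threshold
        self.strength = {}

    def feed(self, syllables):
        # Forgetting: every stored bond weakens before new input is counted.
        for pair in self.strength:
            self.strength[pair] *= self.decay
        # Learning: each adjacent pair in the utterance is reinforced by 1.
        for pair in zip(syllables, syllables[1:]):
            self.strength[pair] = self.strength.get(pair, 0.0) + 1.0

    def bonds(self):
        return {p for p, s in self.strength.items() if s >= self.threshold}

memory = BondMemory()
memory.feed(["volt", "EGY", "szer"])
memory.feed(["volt", "EGY", "szer"])
after_repeat = memory.bonds()      # repeated pairs have become bonds

memory.feed(["a", "KI", "raly"])
memory.feed(["a", "KI", "raly"])
after_other = memory.bonds()       # unrepeated bonds have faded

print(sorted(after_repeat))
print(sorted(after_other))
```

With these arbitrary settings a bond survives only if it keeps being encountered, which is one simple way to let some bonds survive and others not.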


Questions, suggestions and shattering criticism are welcome at: 
EMAIL: http://spirospero.net/email.html 


References 

See also http://spirospero.net/complexity.htm 

1. Yuri Tarnopolsky (2005). The Three Little Pigs: Chemistry of language acquisition. 
http://spirospero.net/3LP.pdf 

2. Grenander, Ulf. Elements of Pattern Theory. Baltimore: Johns Hopkins University 
Press, 1995. 

3. Grenander, Ulf. Patterns of Thought. www.dam.brown.edu/ptg/REPORTS/mind.pdf 
Watch for updates of Patterns of Thought. 

4. A so. www.magyarora.com/literature/Benedek so.pdf 
From the site: http://www.magyarora.com/english/index.html 
http://www.magyarora.com/english/literature.html 
Audio: http://www.magyarora.com/phonetics/magyarora literature benedek.rm 

5. Robert E. Owens, Jr. (1992). Language Development: An Introduction. New York: 
Macmillan. 

6. Kolobok. http://www.sunbirds.com/lacquer/readings/1203 

7. Yuri Tarnopolsky (2004). Tikki Tikki Tembo: The Chemistry of Protolanguage. 
http://spirospero.net/Nean.pdf 















APPENDIX 


The approximate pronunciation of some consonants is: gy = d in “due”, j and ly = y in 
“you,” s = sh, sz = s, and cs = ch. The short vowels are: a, e, i, o, u, ö, and ü; they 
all sound, approximately, as in German. The long vowels are Á á, É é, Í í, Ó ó, Ú ú, Ű ű, 
and Ő ő. The latter, Ő ő, which is not rendered by MATLAB, will be alternatively 
denoted Õ, õ. The stress and the length of the vowels are independent, which gives the 
Hungarian language its characteristic syncopated melody. The stress is on the first 
syllable, but what is a syllable? I don’t hear it as carved in stone. 


A Só 

The transcription of the folktale by Elek Benedek [4]. 

Volt egyszer egy öreg király s három szép leánya. Az öreg király szerette volna mind a 
három leányát férjhez adni. Ez nem is lett volna nehéz, mert három országa volt, mind a 
három leányára jutott egy-egy ország. Hanem ahogyan nincs három egyforma alma, úgy 
a három ország sem volt egyforma. 

Azt mondta egyszer a király a leányainak, hogy annak adja a legszebb országát, amelyik 
őt legjobban szereti. 

- Felelj nekem, édes leányom, hogy szeretsz engem? - kérdezte a legidősebbiket. 

- Mint a galamb a tiszta búzát - mondta a leány. 

- Hát te, édes leányom? - kérdezte a középsőt. 

- Én úgy, édesapám, mint forró nyárban a szellőt. 

- No, most téged kérdezlek - fordult a legkisebbikhez -, mondjad, hogy szeretsz? 

- Úgy, édesapám, ahogy az emberek a sót! - felelte a kicsi királykisasszony. 

- Mit beszélsz, te! - förmedt rá a király. - Ki az udvaromból, de még az országomból is! 
Ne is lássalak, ha csak ennyire szeretsz! 

Hiába sírt a királykisasszony, hiába magyarázta, hogy az emberek szeretik a sót: világgá 
kellett hogy menjen. 

Elindult a kicsi királykisasszony sírva, s beért egy nagy erdőbe. Onnan nem is ment 
tovább, ott élt egy darabig egymagában. 

Egyszer, mikor már egy esztendő is eltelt, arra járt a szomszéd királyfi, s meglátta a 
királykisasszonyt. 

Megtetszett a királyfinak a királykisasszony, mert akármilyen piszkos volt a ruhája, szép 
volt, különösen az arca. Szépen megfogta a kezét, hazavezette a palotájába, s két hetet 
sem várt, de még egyet sem, de talán még egy órát sem, és megesküdtek. 

A fiatal pár békésen élt, úgy szerették egymást, mint két galamb. Egyszer azt mondta a 
király: 

- No, feleség, amikor először megláttalak, nem kérdeztem, miért kergetett el az apád. 
Mondd meg nekem a valóságot! 

- Azt kérdezte tőlem, hogy szeretem őt, s én azt feleltem: mint az emberek a sót. 

- Jól van, majd csinálok én valamit, tudom, megszeret újra az édesapád - mondta a király. 
S azzal levelet írt az öreg királynak, s abban meghívta ebédre. El is ment a levél másnap, 
s harmadnap jött a király. Fölvezette a fiatal király az öreg királyt a palotába. Ott már 
meg volt terítve az asztal két személyre s leültek. No, ez volt csak az ebéd! Megkóstolta 
az öreg király a levest, de le is tette mindjárt a kanalat, nem tudta megenni, olyan sótlan 
volt. Gondolta magában az öreg király: ebből bizony kifelejtették a sót, de a többi ételben 
majd csak lesz. De nem volt azokban sem. Hordták a pecsenyéket, de vissza is vihették, 
mert az öreg király bele sem harapott, olyan sótlan, ízetlen volt mind. De ezt már nem 
hagyta szó nélkül: 

- Hallod-e, öcsém, hát milyen szakácsod van neked, hogy só nélkül süt-főz? - kérdezte. 

- Sóval süt-főz ez máskor mindig, de én azt hallottam, hogy bátyámuram nem szereti a 
sót, azt mondtam hát neki, hogy ne tegyen sót az ételekbe. 

- No, azt rosszul tetted, mert én nagyon szeretem a sót. Kitől hallottad, hogy nem 
szeretem? 

- A leányától - mondta a fiatal király. Abban a pillanatban kinyílt az ajtó, és belépett a 
királyné, az öreg király legkisebb leánya. Hej, istenem, örült az öreg király! Mert sajnálta 
már nagyon, hogy elkergette a leányát. Most neki adta a legnagyobb országát. 

Még ma is élnek, ha meg nem haltak. 





Translation: 

Salt 

Once upon a time there lived an old king with his three beautiful daughters. The old king 
wanted all three of his daughters to get married. It would not be difficult because he had 
three lands, one for each daughter. But as there are no three apples alike, so there are no 
three lands alike. And so the king told his daughters that he would give the best kingdom to 
the daughter who loved him most. 

“Tell me, my dear daughter, how do you love me?” he asked the oldest. 

“As a dove loves pure grain,” answered the eldest. 

“And you, dear daughter?” He asked the middle one. 

“And I love you as much as a breeze in hot summer.” 

“Now I am asking you,” he turned to the youngest, “tell me, how do you love me?” 

“As much, dear father, as people love salt,” answered the little princess. 

“What are you saying!” The king shouted. “Off my court, and get out of my land. 
I don’t want to see you if you love me that much.” 

In vain the princess cried, in vain explained how people like salt: she had to go. 
The little princess left in tears and came to a big forest. She did not go any further, she 
lived there completely on her own. 

Once, when a year had already passed, a neighboring prince came there and saw 
the little princess. The prince was struck by the princess because however shabby her 
dress was, she was beautiful, especially, her face. He gently took her hand and brought 
her to his palace, and not in two weeks, not even one, maybe not even in an hour they got 
married. 

The young couple lived in love like two doves. 

One day the king said, “Well, my wife, when I first saw you, I did not ask why 
your father banished you. Tell me the truth!” 

“He had asked me how I loved him and I answered: as people love salt.” 



“Good, then I know how to do something so that your father will love you again,” 
said the king. And he wrote a letter, inviting the old king for dinner. Next day the letter 
went out and after another day the king arrived. The young king invited the old king into 
his palace. There already was a table set for two and they sat down. But what a dinner it 
was! The old king tried the soup but had to put down the spoon at once and could not eat 
because it was without salt. The old king thought to himself: here certainly the salt was 
forgotten, but probably not in other food. But there was no salt either. The roast was 
brought in but taken back because the old king could not take a bite, so saltless and 
tasteless everything was. But he already lost his patience. 

“Listen, young man, what kind of cook do you have that cooks without salt?” he asked. 

“He always cooks with salt in other times but I heard that you, sir, do not like salt, 
so I said to him not to salt food.” 

“Well, this is very bad that you did so because I like salt very much. From whom 
did you hear that I do not like it?” 

“From your daughter,” said the young king. 

At this moment, the door opened and the queen, the youngest daughter of the old 
king, entered. Good Lord, how the old king rejoiced! Now he regretted very much having 
chased the girl away. Now he gave her the largest land. 

They are still living, if they have not died. 


Input string for MATLAB 


P=char ('STOP', 'volt', 'EDY', 'szer', 'egy', 'O', 'reg', 'Kl', 'raly', 'PAUSE', 's', 'HA, 
’rom', 'szep', 'LE', 'any', 'a', 'STOP', 'az\ 'O', 'reg', 'Kl', 'raly', ... 

'SZE','ret','te','VOL','na','mind', 'a', 'HA'.'ram', 'LE', 'any', 'at', 'FERJ', 'hez', ... 
'AD', 'ni\ 'STOP', 'ez', 'nern', 'is', 'lett', 'VOL', 'na', 'NE\ 'hez', ’mert’, 'HA, ... 

'rom', 'OR', 'szag', 'a', 'volt', 'PAUSE', 'mind', 'a', 'HA', 'rom', 'LE', ... 

'any', 'a', 'ra', 'JUT', 'otf, 'EDY', 'egy', 'OR', 'szag', 'STOP', 'HA', ’nem', 'A', ... 
'hogy', 'an', 'nines', 'HA', 'rom', 'EDY', 'for', 'rna', 'AL', 'ma', 'PAUSE', 'ugy', 'a', ... 
'HA, 'rom', 'OR', 'szag', 'sem', 'volt', 'EDY', 'for', 'ma', 'STOP', ... 

■azt', 'MOND', 'ta', 'EDY', 'szer', 'a', 'Kl', 'raly', 'a', 'LE', 'any', 'a', 'i', 'nak', ... 

'hogy', 'AN', 'nak', 'AD', 'ja', 'a', 'LEG', 'szebb', 'OR', 'szag', 'at', 'PAUSE', 'A', ... 
'rnely', 'ik', 'ot', 'PAUSE', 'LEG', 'job', 'ban', 'SZER', 'e', 'ti', 'STOP', 'FEL', 'elj', ... 





'NEK', 'em', 'PAUSE', 'ED', 'es', 'LE', 'any', 'om', 'PAUSE', 'hogy', 'SZER', 'etsz', ... 
'EN\ 'gem', 'PAUSE', 'KER', 'dez', 'te', 'a', 'LEG', 'i', 'do', 'sebb', 'ik', 'ef, ... 

'STOP', 'mint', 'a', 'GA\ 'lamb', 'a', 'TISZ', 'ta', 'BUZ', 'at', 'PAUSE', 'MOND', ... 

■ta’, 'a', 'LE', 'any', 'STOP', 'hat', 'te', 'PAUSE', 'ED', 'es', 'LE', 'any', 'om', 'PAUSE', ... 
'KER', 'dez', 'te', 'a', 'KOZ', 'ep', 'sot', 'STOP', 'en', 'ugy', 'ED', 'es', 'AP', ... 

'am', 'PAUSE', 'mint', 'FOR', To', 'NYAR', 'ban', 'a', 'SZEL', 'lot', 'STOP', 'no', ... 
'most', 'TE', 'ged', 'KER', 'dez', 'lek', 'PAUSE', 'FOR', 'dult', 'a', 'LEG', 'kis', 'ebb', ... 
'ik', 'hez', 'STOP', 'MOND', 'jad', 'hogy', 'SZER', 'etsz', 'STOP', 'ugy', 'ED', ... 

'es', 'AP', 'am', 'PAUSE', 'A', 'hogy', 'az', 'EM', 'ber', 'ek', 'a', 'sot', 'PAUSE', 'FEL', ... 
'el', 'te', 'a', 'Kl\ 'csi', 'Kl', 'raly', 'kis', 'asz', 'szony', 'STOP', 'mit', 'BE', ... 

'szelsz', 'te', 'PAUSE', 'FOR', 'medt', 'ra', 'a', 'Kl', 'raly', 'STOP', 'ki', 'az', ... 

'UD', 'var', 'om', 'bol', 'de', 'meg', 'az', 'OR', 'szag', 'om', 'bol', 'is', 'PAUSE', ... 

'STOP', 'ne', 'IS', 'las', 'sa', 'lak', 'PAUSE', 'ha', 'csak', 'EN', 'nyi', 're', 'SZER', ... 
'etsz', 'STOP', 'HI', 'a', ’ba', ’slrt', 'a', 'Kl', 'raly', 'kis', 'asz', 'szony', 'PAUSE', ... 

'HI', 'a', 'ba', 'MA', 'gyar', 'az', 'ta', 'hogy', 'az', 'EM', 'ber', 'ek', 'SZER', ... 

'et', 'ik', 'a', 'sot', 'PAUSE', 'VI', 'lag', 'ga', 'KEL', 'lett', 'hogy', 'MENJ', 'en', ... 

'STOP', 'EL', 'in', 'dult', 'a', 'Kl', 'csi', 'Kl', 'raly', 'kis', 'asz', 'szony', 'SIR', ... 

'va', 'PAUSE', 's', 'BE', 'ert', 'egy', 'nagy', 'ER\ 'do', 'be', 'STOP', 'ON', ... 

'nan', 'nem', 'is', 'ment', 'TO', 'vabb', 'PAUSE', 'ott', 'elt', 'egy', 'DA', 'rab', 'ig', ... 

'EDY', 'ma', 'ga', 'ban', 'STOP', 'EDY', 'szer', 'Ml', 'kor', 'mar', 'egy', ... 

'ESZT', 'en', 'do', 'is', 'el', 'telt', 'PAUSE', 'ar', 'ra', 'JAR', 'ta', 'SZOM', 'szed', ... 

'Kl', 'raly', 'fi', 'PAUSE', 's', 'MEG', 'lat', 'ta', 'a', 'Kl', 'raly', 'kis', 'asz', 'szonyt', ... 
'STOP', 'MEG', 'tet', 'szett', 'a', 'Kl', 'raly', 'fi', 'nak', 'a', 'Kl', 'raly', ... 

'kis', 'asz', 'szony', 'PAUSE', 'mert', 'A', ’kar’, 'mi', 'lyen', 'PISZ', 'kos', ... 

'volt', 'a', 'RU', 'ha', 'ja', 'PAUSE', 'szep', 'volt', 'KU', 'Ion', 'os', 'en', 'az', 'AR', ... 

'ca\ 'STOP', 'SZEP', 'en', 'MEG', 'fog', 'ta', 'a', 'KEZ', 'et', 'PAUSE', ... 

'HA', 'za', 'vez', 'et', 'te', 'a', 'PA', 'lot', 'a', 'ja', 'ba', 'PAUSE', 's', 'ket', 'HET', ... 

'et', 'sem', 'vart', 'PAUSE', 'de', 'meg', 'EDY', 'et', 'sem', 'de', 'TAL', ... 

'an', 'meg', 'egy', 'OR', 'at', 'sem', 'PAUSE', 'es' , 'MEG', 'es', 'kud', 'tek', 'STOP', ... 
'A','FI','at','ar,'par','BE','kes','en','elt','PAUSE','ugy','SZER','et','tek', ... 

'EGY', 'mast','PAUSE','mint','ket','GA','lamb','STOP','EGY','szer','azt','MOND' ... 
,'ta','a','KI','raly','STOP','No','PAUSE','FE','le','seg','PAUSE','A','mi','kor','EL', ... 

'o','szor','MEG','lat','ta','lak','PAUSE','nem','KER','dez','tern','PAUSE','Ml', ... 
'ert','KER','get','ett','el','az','AP','ad','STOP','MONDD','meg','NEK','em','a','VA' ... 

,'16',’sag','ot','STOP','Azt','KER','dez','te','TOL','em','PAUSE','hogy','SZER', ... 

'et','em','ot','PAUSE','s','en','azt','FEL','el','tern','PAUSE','mint','az','EM','ber', ... 
'ek','a','s6t','STOP','J6r,'van','PAUSE','majd','CSr,'nar,'ok','en','VA','la','mit', ... 
'PAUSE','TUD'/om','PAUSE','MEG','szeret','UJ','ra','az','ED','es','AP','ad', ... 
'PAUSE','MOND','ta','a','Kl','raly','STOP','s','AZ','zar,'LE','ver,'et','lrt','az', ... 
'0','reg','Kr,'raly','nak','PAUSE','s','AB','ban','MEG','hfvta','EB','ed','re', ... 
'STOP','Er,'is','ment','a','LE','ver,'MAS','nap','PAUSE','s','HAR','mad','nap'... 
,'j6tt','a','KI','raly','STOP','F0L','vez','et','te','a','FI','at','al','KI','raly','az','0', ... 
'reg','Kr,'ralyt','a','PA','lo','ta','ba','STOP','Ott','mar','meg','volt','TER','lt','ve', ... 
'az','ASZ','tar,'ket','SZEM','ely','re','s','LE','ur,'tek','STOP','No','PAUSE','ez', ... 
'volt','csak','az','EB','ed','STOP','MEG','k6s','tor,'ta','az','0','reg','KI','raly','a', ... 
'LE','vest','PAUSE','de','le','is','TET','te','MIND','jart','a','KAN','al','at','PAUSE' ... 

,'nem','TUD','ta','MEG','en'.'ni','PAUSE','OLY','an','SOT','Ian','volt','STOP', ... 
'GON','dol','ta','MA','ga','ban',’az','O','reg','KI','raly','STOP','EB','bol','Bl', ... 
'zony','KI','fel','ej','tet','tek','a','s6t','PAUSE','de','a','T0B','bi','ET','el','ben', ... 

'majd','csak','lesz','STOP','De','nem','volt','AZ','ok','ban','sem','STOP','HORD', ... 
'tak','a','PE','cseny','ek','et','PAUSE','de','VIS','sza','is','VI','het','tek', ... 

'PAUSE','mert','az','O','reg','KI','raly','BE','le','sem','HA','rap','ott','PAUSE', ... 

'OLY','an','SOT','Ian','PAUSE','IZ','et','len','volt','mind','STOP','De','ezt','mar', ... 
'nem','HAGY','ta','sz6','NEL','kul','STOP','HAL','lod','e','PAUSE','OCS','em', ... 
'PAUSE','hat','MILY','en','SZA','kacs','od','van','NEK','ed\'PAUSE','hogy','so', ... 
'NEL','kur,'SUT','foz','PAUSE','KER','dez','te','STOP','SO','val','SUT','foz', ... 





'ez','MAS','kor','MIND','ig','PAUSE','de'.'en'/azt','HAL','lot','tarn','PAUSE', ... 
'hogy','BATY','am','UR','am','nem','SZER','et','i','a','s6t','PAUSE','azt','MOND', ... 
'tam','hat','NEK','i','PAUSE','hogy','ne','TEGY','en','s6t','az','ET','er,'ek','be', ... 
'STOP','No','PAUSE','azt','ROS','szul','TET','ted','PAUSE','mert','en','NAGY', ... 
'on','SZER','et','em','a','s6t','STOP','Kr,'tor,'HAL','lot','tad','PAUSE','hogy', ... 
'nem','SZER','et','em','STOP','a','LE','any','a','t6r,'PAUSE','MOND','ta','a','Fr, ... 
'at','al','KI','raly','STOP','AB','ban','a','PIL','la','nat','ban','KINY','ilt','az','AJ', ... 
't6','PAUSE','es','BE','lep','ett','a','Kr,'raly','ne','PAUSE','az','0','reg','Kr,'raly', ... 
'LEG','kis','ebb','LE','any','a','STOP','Hej','PAUSE','IS','ten','em','PAUSE', ... 
'0R','ult','az','0','reg','KI','raly','STOP','Mert','SAJ','nal','ta','mar','NAGY','on', ... 
'PAUSE','hogy','EL','ker','get','te','a','LE','any','at','STOP','Most','NEK','i', ... 
'ad','ta','a','LEG','nagy','obb','OR','szag','at','STOP','Meg','ma','is','EL','nek', ... 
'PAUSE','ha','meg','nem','HAL','tak','STOP'); 


Table 4: SALT complete 

size (P) 1069 6 (number and length of syllables) 

size (words) 1 366 


| No. | LEFT | Count | G | RIGHT |
|---|---|---|---|---|
| 1 | PAUSE, STOP, 2-a, ba, ban, 2-be, ca, em, en, et, 2-etsz, hez, kül, lamb, lesz, lot, ma, mind, ni, ot, re, 7-raly, sem, szony, szonyt, szag, 2-sot, sot, te, 2-tek, ti, volt, ad, any, 2-at, ed | 50 | STOP | A, AB, Azt, 2-De, EB, EDY, EGY, EL, El, FEL, FOL, GON, HA, HAL, HI, HORD, Hej, Jol, KI, 2-MEG, MOND, MONDD, Mert, Most, Meg, 3-No, ON, Ott, STOP, SZEP, SO, a, az, azt, ez, hat, ki, mint, mit, ne, no, s, en, ugy |
| 2 | STOP, a, ez, kos, lan, len, meg, nem, sem, szep | 10 | volt | AZ, 2-EDY, KU, PAUSE, STOP, TER, a, csak, mind |
| 3 | STOP, ig, meg, ott, rom, ta, 2-volt | 8 | EDY | egy, et, 2-for, ma, 3-szer |
| 4 | 3-EDY, EGY | 4 | szer | MI, a, azt, egy |
| 5 | EDY, mar, meg, szer, elt, ert | 6 | egy | DA, ESZT, OR, nagy, OR, O |
| 6 | 8-az, egy | 9 | O | 9-reg |
| 7 | 9-O | 9 | reg | 9-KI |
| 8 | STOP, 12-a, 2-al, 2-csi, 9-reg, szed, zony | 28 | KI | 2-csi, fel, 23-raly, ralyt, tol |
| 9 | 23-KI | 23 | raly | BE, LEG, PAUSE, 7-STOP, SZE, 2-a, az, 2-fi, 5-kis, nak, ne |
| 10 | Hej, 3-No, at, ba, e, ed, 3-em, et, fi, foz, gem, i, ig, is, ja, 2-lak, lan, lek, ma, mit, mast, nak, nap, nek, ni, ne, 3-om, on, ot, ott, raly, sem, 2-szony, seg, 4-sot, tad, tam, 2-te, ted, telt, 2-tem, tek, to, tol, va, van, vest, volt, vabb, vart, ad, 2-am, 2-at, elt, em, et, ot | 72 | PAUSE | 3-A, FE, FEL, FOR, FOR, HA, HI, IS, 3-KER, LEG, MEG, MI, 3-MOND, 2-OLY, STOP, TUD, VI, ar, az, 2-azt, 5-de, ez, 2-ha, 7-hogy, hat, majd, 3-mert, mind, 3-mint, 2-nem, ott, 7-s, szep, 2-ED, IZ, OCS, OR, 2-es, 2-ugy |
| 11 | 7-PAUSE, STOP, re | 9 | s | AB, AZ, BE, HAR, HA, LE, MEG, ket, en |
| 12 | 3-a, mert, nincs, s | 6 | HÁ | 6-rom |
| 13 | 6-HÁ | 6 | rom | EDY, 2-LE, 2-OR, szep |
| 14 | PAUSE, rom | 2 | szep | LE, volt |
| 15 | 6-a, ebb, 2-es, 2-rom, s, szep, zal | 14 | LE | vel, vest, vel, 10-any, ül |
| 16 | 10-LE | 10 | any | STOP, 3-a, 2-om, 2-a, 2-at |
| 17 | STOP, 2-ban, de, 2-dult, 2-ek, 2-em, ett, i, ik, ja, jart, jott, lamb, ment, 2-mind, mint, nak, ra, 2-raly, ralyt, szer, szett, szag, sirt, 7-ta, 6-te, tak, tek, volt, 3-any, ugy | 50 | a | 2-FI, GA, 3-HA, KAN, KEZ, 12-KI, KOZ, 6-LE, 4-LEG, 2-PA, PE, PIL, RU, 2-STOP, SZEL, TISZ, TOB, VA, i, 6-sot, volt |
| 18 | PAUSE, STOP, ban, csak, el, en, 2-hogy, ki, mert, mint, meg, ra, raly, sot, ta, ve, ilt, irt, ult | 20 | az | AJ, AP, AR, ASZ, EB, 3-EM, OR, UD, ED, ET, 8-O |
| 19 | raly | 1 | SZE | ret |
| 20 | SZE | 1 | ret | te |
| 21 | TET, 4-dez, el, 2-et, get, hat, ret, szelsz | 12 | te | MIND, 2-PAUSE, STOP, TOL, VOL, 6-a |
| 22 | lett, te | 2 | VOL | 2-na |
| 23 | 2-VOL | 2 | na | NE, mind |
| 24 | PAUSE, na, volt | 3 | mind | STOP, 2-a |
| 25 | BUZ, 2-szag, OR, 2-any | 6 | at | FERJ, 2-PAUSE, 2-STOP, sem |
| 26 | at | 1 | FERJ | hez |
| 27 | FERJ, ik | 2 | hez | AD, STOP |
| 28 | hez, nak | 2 | AD | ja, ni |
| 29 | AD, en | 2 | ni | PAUSE, STOP |
| 30 | PAUSE, STOP, foz | 3 | ez | MAS, nem, volt |
| 31 | De, HA, 2-PAUSE, am, ez, hogy, meg, mar, nan | 10 | nem | A, HAGY, HAL, KER, 2-SZER, TUD, 2-is, volt |
| 32 | El, bol, do, le, ma, 2-nem, sza | 8 | is | PAUSE, TET, VI, el, lett, 2-ment, EL |
| 33 | KEL, is | 2 | lett | VOL, hogy |
| 34 | na | 1 | NE | hez |
| 35 | NE | 1 | hez | mert |
| 36 | 3-PAUSE, hez | 4 | mert | A, HA, az, en |
| 37 | az, egy, obb, 2-rom, szebb | 6 | OR | 6-szag |
| 38 | 6-OR | 6 | szag | STOP, a, om, sem, 2-at |
| 39 | 2-HI, lot, 2-any | 5 | a | 2-ba, ja, ra, tol |
| 40 | ar, UJ, a | 3 | ra | JUT, JAR, az |
| 41 | ra | 1 | JUT | ott |
| 42 | JUT, PAUSE, rap | 3 | ott | EDY, PAUSE, elt |
| 43 | PAUSE, STOP, sem | 3 | HA | nem, rap, za |
| 44 | 3-PAUSE, STOP, mert, nem | 6 | A | FI, 2-hogy, kar, mely, mi |
| 45 | 2-A, 7-PAUSE, jad, lett, nak, ta | 13 | hogy | AN, BATY, EL, MENJ, 3-SZER, an, 2-az, ne, nem, so |
| 46 | 2-OLY, hogy | 3 | an | 2-SOT, nincs |
| 47 | an | 1 | nincs | HÁ |
| 48 | 2-EDY | 2 | for | 2-ma |
| 49 | AL, EDY, Meg, 2-for | 5 | ma | AL, PAUSE, STOP, ga, is |
| 50 | ma | 1 | AL | ma |
| 51 | 2-PAUSE, STOP, en | 4 | ugy | SZER, a, 2-ED |
| 52 | ban, 2-et, le, szag, at | 6 | sem | HA, PAUSE, STOP, de, volt, vart |
| 53 | 2-PAUSE, STOP, szer, 2-en | 6 | azt | FEL, HAL, 3-MOND, ROS |
| 54 | 3-PAUSE, STOP, 3-azt | 7 | MOND | jad, 5-ta, tam |
| 55 | HAGY, JAR, 5-MOND, TISZ, TUD, ad, dol, fog, 2-lat, nal, tol, az | 17 | ta | BUZ, EDY, MA, MEG, SZOM, 7-a, az, hogy, lak, mar, szo |
| 56 | LEG, 2-NEK, a, et | 5 | i | PAUSE, a, ad, do, nak |
| 57 | AN, fi, i, raly | 4 | nak | AD, PAUSE, a, hogy |
| 58 | hogy | 1 | AN | nak |
| 59 | AD, ha | 2 | ja | PAUSE, a |
| 60 | PAUSE, 4-a, raly | 6 | LEG | i, job, 2-kis, nagy, szebb |
| 61 | LEG | 1 | szebb | OR |
| 62 | A | 1 | mely | ik |
| 63 | ebb, et, mely, sebb | 4 | ik | a, et, hez, ot |
| 64 | ik | 1 | ot | PAUSE |
| 65 | LEG | 1 | job | ban |
| 66 | 2-AB, NYAR, 2-ga, job, nat, ok | 8 | ban | KINY, MEG, STOP, SZER, 2-a, az, sem |
| 67 | ban, ek, 3-hogy, 2-nem, on, re, ugy | 10 | SZER | e, 6-et, 3-etsz |
| 68 | SZER, lod | 2 | e | PAUSE, ti |
| 69 | e | 1 | ti | STOP |
| 70 | PAUSE, STOP, azt | 3 | FEL | 2-el, elj |
| 71 | FEL | 1 | elj | NEK |
| 72 | Most, elj, hat, meg, van | 5 | NEK | ed, 2-em, 2-i |
| 73 | 2-NEK, TOL, 3-et, ten | 7 | em | 3-PAUSE, STOP, 2-a, ot |
| 74 | 2-PAUSE, az, 2-ugy | 5 | ED | 5-es |
| 75 | MEG, 5-ED | 6 | es | 3-AP, 2-LE, küd |
| 76 | TUD, szag, var, 2-any | 5 | om | 3-PAUSE, 2-bol |
| 77 | 3-SZER | 3 | etsz | EN, 2-STOP |
| 78 | csak, etsz | 2 | EN | gem, nyi |
| 79 | EN | 1 | gem | PAUSE |
| 80 | Azt, 3-PAUSE, ged, nem | 6 | KER | 6-dez |
| 81 | 6-KER | 6 | dez | lek, 4-te, tem |
| 82 | ER, en, i | 3 | do | be, is, sebb |
| 83 | do | 1 | sebb | ik |
| 84 | EDY, HET, 6-SZER, ik, vel, 2-vez, IZ, ek | 14 | et | PAUSE, STOP, 3-em, i, ik, len, 2-sem, 2-te, tek, irt |
| 85 | 3-PAUSE, STOP | 4 | mint | FOR, a, az, ket |
| 86 | a, ket | 2 | GA | 2-lamb |
| 87 | 2-GA | 2 | lamb | STOP, a |
| 88 | a | 1 | TISZ | ta |
| 89 | ta | 1 | BUZ | at |
| 90 | PAUSE, STOP, tam | 3 | hat | MILY, NEK, te |
| 91 | a | 1 | KOZ | ep |
| 92 | KOZ | 1 | ep | sot |
| 93 | ep | 1 | sot | STOP |
| 94 | STOP, de, mert, ok, s | 5 | en | NAGY, VA, 2-azt, ugy |
| 95 | az, 3-es | 4 | AP | 2-ad, 2-am |
| 96 | 2-AP, BATY | 3 | am | 2-PAUSE, UR |
| 97 | PAUSE, mint | 2 | FOR | dult, ro |
| 98 | FOR | 1 | ro | NYAR |
| 99 | ro | 1 | NYAR | ban |
| 100 | a | 1 | SZEL | lot |
| 101 | SZEL | 1 | lot | STOP |
| 102 | STOP | 1 | no | most |
| 103 | no | 1 | most | TE |
| 104 | most | 1 | TE | ged |
| 105 | TE | 1 | ged | KER |
| 106 | dez | 1 | lek | PAUSE |
| 107 | FOR, in | 2 | dult | 2-a |
| 108 | 2-LEG, 5-raly | 7 | kis | 5-asz, 2-ebb |
| 109 | 2-kis | 2 | ebb | LE, ik |
| 110 | MOND | 1 | jad | hogy |
| 111 | 3-az | 3 | EM | 3-ber |
| 112 | 3-EM | 3 | ber | 3-ek |
| 113 | 3-ber, el | 4 | ek | SZER, 2-a, be |
| 114 | 6-a, en | 7 | sot | 4-PAUSE, 2-STOP, az |
| 115 | 2-FEL, ett, is, 2-ET | 6 | el | az, ben, ek, te, telt, tem |
| 116 | 2-KI | 2 | csi | 2-KI |
| 117 | 5-kis | 5 | asz | 4-szony, szonyt |
| 118 | 4-asz | 4 | szony | 2-PAUSE, STOP, SIR |
| 119 | STOP, la | 2 | mit | BE, PAUSE |
| 120 | mit, raly, s, es | 4 | BE | le, lep, szelsz, ert |
| 121 | BE | 1 | szelsz | te |
| 122 | PAUSE | 1 | FOR | medt |
| 123 | FOR | 1 | medt | ra |
| 124 | medt | 1 | ra | a |
| 125 | STOP | 1 | ki | az |
| 126 | az | 1 | UD | var |
| 127 | UD | 1 | var | om |
| 128 | 2-om | 2 | bol | de, is |
| 129 | 5-PAUSE, bol, sem | 7 | de | TAL, VIS, a, le, 2-meg, en |
| 130 | 2-de, an | 3 | meg | EDY, az, egy |
| 131 | STOP, hogy | 2 | ne | IS, TEGY |
| 132 | PAUSE, ne | 2 | IS | las, ten |
| 133 | IS | 1 | las | sa |
| 134 | las | 1 | sa | lak |
| 135 | sa, ta | 2 | lak | 2-PAUSE |
| 136 | 2-PAUSE | 2 | ha | csak, meg |
| 137 | ha, majd, volt | 3 | csak | EN, az, lesz |
| 138 | EN | 1 | nyi | re |
| 139 | nyi, ed, ely | 3 | re | STOP, SZER, s |
| 140 | PAUSE, STOP | 2 | HI | 2-a |
| 141 | ja, ta, 2-a | 4 | ba | MA, PAUSE, STOP, sirt |
| 142 | ba | 1 | sirt | a |
| 143 | ba, ta | 2 | MA | gyar, ga |
| 144 | MA | 1 | gyar | az |
| 145 | gyar | 1 | az | ta |
| 146 | PAUSE, is | 2 | VI | het, lag |
| 147 | VI | 1 | lag | ga |
| 148 | MA, lag, ma | 3 | ga | KEL, 2-ban |
| 149 | ga | 1 | KEL | lett |
| 150 | hogy | 1 | MENJ | en |
| 151 | ESZT, MEG, MENJ, MILY, SZEP, TEGY, kes, os | 8 | en | MEG, STOP, SZA, az, do, ni, sot, elt |
| 152 | STOP, hogy, kor | 3 | EL | in, ker, o |
| 153 | EL | 1 | in | dult |
| 154 | szony | 1 | SIR | va |
| 155 | SIR | 1 | va | PAUSE |
| 156 | BE, MI | 2 | ert | KER, egy |
| 157 | LEG, egy | 2 | nagy | ER, obb |
| 158 | nagy | 1 | ER | do |
| 159 | do, ek | 2 | be | 2-STOP |
| 160 | STOP | 1 | ON | nan |
| 161 | ON | 1 | nan | nem |




TO _ 

en, ott _ 

egy _ 

DA _ 

MIND, rab _ 

PAUSE, szer _ 

MI, MAS, mi _ 

Ott, ezt, kor, ta _ 

egy _ 

el _ 

PAUSE _ 

ra 

ta 

SZOM 

2-raly 

PAUSE, 2-STOP, ban, en, s, szor, 

es _ 

2-MEG _ 

asz 

MEG, ej 

tet _ 

_A_ 

A, kar 


AR _ 

STOP 

MEG 

a 

KEZ 

HA _ 

FOL, za 
2-a 

2-HAL, PA 
a 

mint, s, tal 
ket 


egy 

2-PAUSE 

es 

kud, iil 
A. 2-a 


ment TO, a 

TO _ vabb _ 

vabb PAUSE _ 

elt _ PAUSE, egy _ 

DA rab 

rab _ig_ 

Jg_ EDY, PAUSE _ 

MI kor, ert 

kor EL, MIND, mar 

mar _ NAGY, egy, meg, nem _ 

ESZT en 

telt "PAUSE 

ar ra 

JAR ta 

SZOM ~szed 

szed _KI_ 

JI_ PAUSE, nak _ 

MEG en, es, fog, hivta, kos, 2-lat, szeret, 

_tet_ 

lat 2-ta 

szonyt STOP _ 

tet szett, tek 


kor, lyen 
PISZ 

kos _ 

volt 

ha 


et _ 

PAUSE _ 

vez _ 

2-et 
lo, lot 
tad, tam, a 
ba 

GA, HET, SZEM 
et 

PAUSE 

an 

meg _ 

at 

BE, MEG 
tek 

2- STOP 

3- a 





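| No. | LEFT | Count | G | RIGHT |
|---|---|---|---|---|
| 215 | 3-FI, al | 4 | at | PAUSE, 3-al |
| 216 | KAN, 3-at | 4 | al | 2-KI, at, par |
| 217 | al | 1 | par | BE |
| 218 | par | 1 | BE | kes |
| 219 | BE | 1 | kes | en |
| 220 | et, het, tet | 3 | tek | EGY, PAUSE, a |
| 221 | STOP, tek | 2 | EGY | mast, szer |
| 222 | EGY | 1 | mast | PAUSE |
| 223 | 3-STOP | 3 | No | 3-PAUSE |
| 224 | PAUSE | 1 | FE | le |
| 225 | BE, FE, de | 3 | le | is, sem, seg |
| 226 | le | 1 | seg | PAUSE |
| 227 | EL | 1 | o | szor |
| 228 | o | 1 | szor | MEG |
| 229 | dez, el | 2 | tem | 2-PAUSE |
| 230 | ert | 1 | KER | get |
| 231 | KER, ker | 2 | get | ett, te |
| 232 | get, lep | 2 | ett | a, el |
| 233 | 2-AP | 2 | ad | PAUSE, STOP |
| 234 | STOP | 1 | MONDD | meg |
| 235 | MONDD, ha, mar | 3 | meg | NEK, nem, volt |
| 236 | a, en | 2 | VA | la, lo |
| 237 | VA | 1 | lo | sag |
| 238 | lo | 1 | sag | ot |
| 239 | em, sag | 2 | ot | PAUSE, STOP |
| 240 | STOP | 1 | Azt | KER |
| 241 | te | 1 | TOL | em |
| 242 | STOP | 1 | Jol | van |
| 243 | Jol, od | 2 | van | NEK, PAUSE |
| 244 | PAUSE, ben | 2 | majd | CSI, csak |
| 245 | majd | 1 | CSI | nal |
| 246 | CSI, SAJ | 2 | nal | ok, ta |
| 247 | AZ, nal | 2 | ok | ban, en |
| 248 | PIL, VA | 2 | la | mit, nat |
| 249 | PAUSE, nem | 2 | TUD | om, ta |
| 250 | MEG | 1 | szeret | UJ |
| 251 | szeret | 1 | UJ | ra |
| 252 | s, volt | 2 | AZ | ok, zal |
| 253 | AZ | 1 | zal | LE |
| 254 | LE | 1 | vel | et |
| 255 | et | 1 | irt | az |
| 256 | STOP, s | 2 | AB | 2-ban |
| 257 | MEG | 1 | hivta | EB |
| 258 | STOP, az, hivta | 3 | EB | bol, 2-ed |
| 259 | 2-EB | 2 | ed | STOP, re |
| 260 | STOP | 1 | El | is |
| 261 | LE | 1 | vel | MAS |
| 262 | ez, vel | 2 | MAS | kor, nap |
| 263 | MAS, mad | 2 | nap | PAUSE, jott |
| 264 | s | 1 | HAR | mad |
| 265 | HAR | 1 | mad | nap |
| 266 | nap | 1 | jott | a |
| 267 | STOP | 1 | FOL | vez |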




K1 

PA 

lo 

STOP 

volt 

TER 


ASZ 

ket 

SZEM 
LE 
MEG 
KI, kos 
LE 

is, szul 
kor, te 
MIND 
a 

2-PAUSE 

2-an 

2-SOT 

STOP 

GON 

EB 

bol _ 

_BI_ 

KI 

fel_ 

a 

TOB 
az, bi 

el _ 

csak 

2-STOP 

STOP 

HORD 


cseny 

de _ 

VIS 

VI _ 

HA 

PAUSE 

et _ 

De 


szo, so _ 

2-NEL _ 

STOP, azt, nem, tol 
HAL 


1 TER 

1 it_ 

1 ve 
1 ASZ 
1 tal 
1 SZEM 
1 ely 
1 ill_ 

1 kos 

2 tol 

1 vest 

2 TET 

2 MIND 
1 jart 
1 KAN 


1 | zony 


1 ben 

1 lesz 

2 De 

1 HORD 
1 tak 
1 PE 
1 cseny 


1 rap 
1 IZ 
1 len 
1 ezt 
1 HAGY~ 

1 szo 

2 NEL 
2 kill 
~4 HAL 

1 lod 


al 

2-an 

2-lan 

PAUSE, volt 
dol 



_S!_ 

tet 

_bi_ 

ET 

2-el 

majd 

STOP 

ezt, nem 

tak 


NEL _ 

2-kill 

STOP, SUT ~ 
lod, 2-lot, tak 




35 


1 | PAUSE | 1 | OCS | em
2 | OCS | 1 | em | PAUSE
3 | hat | 1 | MILY | en
4 | en | 1 | SZA | kacs
5 | SZA | 1 | kacs | od
6 | kacs | 1 | od | van
7 | NEK | 1 | ed | PAUSE
8 | hogy | 1 | SO | NEL
9 | kul, val | 2 | SUT | 2-foz
330 | 2-SUT | 2 | foz | PAUSE, ez
1 | STOP | 1 | SO | val
2 | SO | 1 | val | SUT
3 | MOND, lot | 2 | tam | PAUSE, hat
4 | hogy | 1 | baty | am
5 | am | 1 | UR | am
6 | UR | 1 | am | nem
7 | ne | 1 | TEGY | en
8 | azt | 1 | ROS | szul
9 | ROS | 1 | szul | TET
340 | TET | 1 | ted | PAUSE
1 | mar, en | 2 | NAGY | 2-on
2 | 2-NAGY | 2 | on | PAUSE, SZER
3 | lot | 1 | tad | PAUSE
4 | a | 1 | tol | PAUSE
5 | a | 1 | PIL | la
6 | la | 1 | nat | ban
7 | ban | 1 | KINY | lit
8 | KINY | 1 | lit | az
9 | az | 1 | AJ | to
350 | AJ | 1 | to | PAUSE
1 | BE | 1 | lep | ett
2 | raly | 1 | ne | PAUSE
3 | STOP | 1 | Hej | PAUSE
4 | IS | 1 | ten | em
5 | PAUSE | 1 | OR | ult
6 | OR | 1 | ult | az
7 | STOP | 1 | Mert | SAJ
8 | Mert | 1 | SAJ | nal
9 | EL | 1 | ker | get
360 | STOP | 1 | Most | NEK
1 | i | 1 | ad | ta
2 | nagy | 1 | obb | OR
3 | STOP | 1 | Meg | ma
4 | is | 1 | EL | nek
5 | EL | 1 | nek | PAUSE
6 | HAL | 1 | tak | STOP
























































Salt 2: Incremental Extraction of Grammar 

by Simplistic Rules 


Yuri Tarnopolsky 
2005 

Abstract 

This e-paper continues the examination of language as a quasi-molecular system 
from the point of view of a chemist who happens to ask, “What if the words were 
atoms?” Previously, a scheme of incremental language acquisition, based on very 
few and simple chemistry-inspired principles, was described using the example of the
Hungarian folktale A So (Salt). In this e-paper, the principles are further applied
to a sequence of acquisition steps. The process does not include any numerical 
calculations. The elementary acts of analysis and extraction are regarded as 
binary encounters of quasi-molecules: small linear sequences of “atoms” of 
language negotiate the outcome of the “collision.” A concept of natural 
computing in language evolution and acquisition is discussed. 

KEYWORDS: language acquisition, bootstrapping, pattern theory, chemistry, automatic 
translation, Turing test, Hungarian language, speech generation, semantics, robot-child, 
robot communication, incremental learning, grammar extraction, language acquisition 
device. 






This e-paper is a further continuation of the examination of language as a quasi- 
molecular system from the point of view of a chemist who happens to ask, “What if the 
words were atoms?” For the explanatory and introductory material, as well as the text and 
translation of the Hungarian tale Salt, see SALT [1], where more references could be 
found. The overall intent can be formulated as application of chemical ideas to subjects 
outside chemistry: mind, language, society, and technology. While physicists have been 
doing that with physical ideas for over one hundred years, some chemists are only now 
slowly and timidly coming to the realization that chemistry might carry its own
extra-chemical message.

The purpose of SALT 2 is to see if there is some bread to SALT [1]. The latter is 
absolutely necessary for understanding SALT 2. In a preliminary fashion, I attempt to test 
the outcome of SALT experiment for any, however early, promise to be used for 
linearization of thought structure. The latter is understood in terms of Pattern Theory as a 
configuration characterized by content, connector, and their quantitative measure [2]. 

One of the main stimuli of this embryonic work is to develop basic principles that 
could represent the process of individual language acquisition by a robot-child, whether 
realistic or not, in all concrete detail, without being overshadowed by mathematical
equations, graphs, numbers, and collective behavior. This is a typically chemical manner
of investigation, embodied in the structural chemical equations. Ideally, they represent a 
detailed sequence of all stable and ephemeral states of the reaction in terms of individual 
atoms and bonds in participating molecules. 

A possible application of this framework, if it turns out promising, is robotic 
communication based on grammar and lexicon acquired with minimal assistance and in 
conditions of poverty of stimulus. The robot-child, as the model can be called, has to 
go from infancy to the beginning of maturity when speech is mastered and active 
intentional learning becomes possible. Infants do not do that by reading The Wall Street 
Journal. Although the term bootstrapping, used in various meanings, is vague, it seems
appropriate for the pre-learning mechanism. Its mechanistic and automatic nature 
resonates well with the term Language Acquisition Device (Chomsky). 

While loose ends and questions hanging in the air can be clearly seen in this 
experiment, some excuse is that we all started with baby talk. 

In SALT 2 I use larger fragments of input than in SALT, in order to accumulate the
representation of a larger text faster, but shorter ones work similarly.

The problems of semantics are considered here least of all, although some initial
idea will be put forward: semantics is possibly as presentable in triangles as syntax in
triplets, i.e., squashed triangles.

I perform some easy operations, like input rewriting and haplology elimination,
manually, while more cumbersome ones, like generating comprehensive tables of bonds
and categories (CATS), are done with MATLAB codes (can be sent on request).

No explicit numerical calculations are involved.

The MATLAB output needs some simple manipulations with MS Word in order
to convert it to tables in document format. Macros can be used.

The following six steps of acquisition are described with a diminishing degree of 
detail. The stressed syllables are capitalized. 





STEP 1 


INPUT 1 


P1=char('volt', 'EGY', 'szer', 'egy', 'O', 'reg', 'KI', 'raly', 'PAUSE', 'es', 'HA', 'rom', 'szep', 'LE', 'any', 'a', ...
'STOP', 'az', 'O', 'reg', 'KI', 'raly', 'SZER', 'et', 'te', 'VOL', 'na', 'mind', 'a', 'HA', 'rom', 'LE', 'any', 'at', 'FERJ', ...
'hez', 'AD', 'ni', 'STOP', 'ez', 'nem', 'is', 'lett', 'VOL', 'na', 'NE', 'hez', 'mert', 'HA', 'rom', 'OR', 'szag', 'a', ...
'volt', 'PAUSE', 'mind', 'a', 'HA', 'rom', 'LE', 'any', 'a', 'ra', 'JUT', 'ott', 'EGY', 'egy', 'OR', 'szag', 'STOP');


'START' and 'END' are added to P1:

P1=char('START', 'volt', 'EGY', ..., 'OR', 'szag', 'STOP', 'END');


output: 

1. Generators and triplets 

Command: ms, dsgn . 

ms (mindset) compiles structure G in which every generator enters only once, 
dsgn (display generators) displays the triplets. 


GENERATOR SPACE 1 


P = 72, G = 44 (72 partly repeating and 44 different generators in input).

The left and right neighbors are preceded by the number of their occurrences.
The occurrences of central generators are in Column 4.


G:

1 | 2 | 3 | 4 | 5
No. | LEFT NEIGHBOR | G | No. of entries | RIGHT NEIGHBOR
1 | START | START | 1 | volt;
2 | START; a; | volt | 2 | EGY; PAUSE;
3 | ott; volt; | EGY | 2 | egy; szer;
4 | EGY; | szer | 1 | egy;
5 | EGY; szer; | egy | 2 | OR; O;
6 | az; egy; | O | 2 | 2-reg;
7 | 2-O; | reg | 2 | 2-KI;
8 | 2-reg; | KI | 2 | 2-raly;
9 | 2-KI; | raly | 2 | 1-PAUSE; SZER;
10 | raly; volt; | PAUSE | 2 | mind; es;
11 | PAUSE; | es | 1 | HA;
12 | 2-a; mert; es; | HA | 4 | 4-rom;
13 | 4-HA; | rom | 4 | 2-LE; OR; szep;
14 | rom; | szep | 1 | LE;
15 | 2-rom; szep; | LE | 3 | 3-any;
16 | 3-LE; | any | 3 | a; a; at;
17 | 2-mind; szag; any; | a | 4 | 2-HA; STOP; volt;
18 | a; ni; szag; | STOP | 3 | END; az; ez;
19 | STOP; | az | 1 | O;
20 | raly; | SZER | 1 | et;
21 | SZER; | et | 1 | te;
22 | et; | te | 1 | VOL;
23 | lett; te; | VOL | 2 | 2-na;
24 | 2-VOL; | na | 2 | NE; mind;
25 | PAUSE; na; | mind | 2 | 2-a;
26 | any; | at | 1 | FERJ;
27 | at; | FERJ | 1 | hez;
28 | FERJ; | hez | 1 | AD;
29 | hez; | AD | 1 | ni;
30 | AD; | ni | 1 | STOP;
31 | STOP; | ez | 1 | nem;
32 | ez; | nem | 1 | is;
33 | nem; | is | 1 | lett;
34 | is; | lett | 1 | VOL;
35 | na; | NE | 1 | hez;
36 | NE; | hez | 1 | mert;
37 | hez; | mert | 1 | HA;
38 | egy; rom; | OR | 2 | 2-szag;
39 | 2-OR; | szag | 2 | STOP; a;
40 | any; | a | 1 | ra;
41 | a; | ra | 1 | JUT;
42 | ra; | JUT | 1 | ott;
43 | JUT; | ott | 1 | EGY;
44 | STOP; | END | 1 | END
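
A table of this kind can be tabulated mechanically. The following is a minimal Python sketch, not the author's MATLAB commands ms and dsgn, of one way to collect each generator with its left neighbors, number of entries, and right neighbors. The function name generator_space and the shortened token list (the opening syllables of INPUT 1, cut off early) are assumptions made for illustration only.

```python
from collections import Counter

def generator_space(tokens):
    """For each distinct token, in order of first appearance, collect
    (generator, left-neighbor counts, number of entries, right-neighbor counts)."""
    order, left, right, entries = [], {}, {}, Counter()
    for i, tok in enumerate(tokens):
        if tok not in left:
            order.append(tok)
            left[tok], right[tok] = Counter(), Counter()
        entries[tok] += 1
        if i > 0:
            left[tok][tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            right[tok][tokens[i + 1]] += 1
    return [(g, dict(left[g]), entries[g], dict(right[g])) for g in order]

# Hypothetical short input: the opening syllables of INPUT 1, cut off early.
tokens = ['START', 'volt', 'EGY', 'szer', 'egy', 'O', 'reg', 'KI', 'raly',
          'PAUSE', 'es', 'HA', 'rom', 'szep', 'LE', 'any', 'a', 'STOP',
          'az', 'O', 'reg', 'KI', 'raly', 'SZER']

space = generator_space(tokens)
for g, lefts, n, rights in space:
    print(g, lefts, n, rights)
```

On this fragment the generator O has two entries, with left neighbors egy and az and right neighbor reg (twice), which agrees with line 6 of GENERATOR SPACE 1.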


Extraction of CATS and BONDS implements the following simplistic rules: 


RULE 1: Adjacency A—B is registered as bond if {A—B} repeats two or
more times.

RULE 2: If {A—B, A—C} or {D—A, E—A}, A is a generator.

RULE 3: Haplology is eliminated.

RULE 4: If {A—B, A—C}, then A is a RIGHT CAT with domain
{B,C}. If {B—A, C—A}, then A is a LEFT CAT with domain {B,C}.

EXAMPLES: 

1. Bonds a—HA and mind—a contain generator a, which is
encountered also in doublets any—a and a—volt. Therefore, they cannot be
qualified as very stable blocks, but could remain as background weak bonds.

Doublets a—HA and a—volt form right category (RIGHT CAT) a—{HA, volt};
doublets mind—a and any—a form left category (LEFT CAT) {mind, any}—a.

2. In the following bond sequences the middle doublet is removed to eliminate
haplology:

{O—reg, reg—KI, KI—raly} → {O—reg, KI—raly}

{HA—rom, rom—LE, LE—any} → {HA—rom, LE—any}.

3. KI and raly in INPUT 1 occur twice and only as a doublet. This is why they
are qualified as a bond KI_raly, see below. The block further becomes a generator.

Under these circumstances RULE 2 means that CAT is a generator. The 
difference is that RULE 2 is applicable to new input, in which only B and C are known, 
while RULE 4 applies to known generators. More important, RULE 2 can be applied to 
levels below syllables. This difference is subtle and both rules can be combined. 
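
The rules can be sketched in a few lines of Python. This is an illustrative reading of RULES 1, 3, and 4, not the author's MATLAB code (cblr); the function names, the greedy treatment of haplology, and the shortened token list are assumptions. Domains are collected here as distinct neighbors.

```python
from collections import Counter, defaultdict

def extract_bonds(tokens, min_repeats=2):
    """RULE 1: an adjacency A-B is a bond if it repeats min_repeats or more times."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    return [pair for pair, n in pair_counts.items() if n >= min_repeats]

def eliminate_haplology(bonds):
    """RULE 3 (one possible reading): drop a bond whose left member is the
    right member of the bond kept just before it."""
    kept = []
    for a, b in bonds:
        if kept and kept[-1][1] == a:
            continue
        kept.append((a, b))
    return kept

def extract_cats(tokens):
    """RULE 4: a generator with two or more distinct neighbors on one side is a CAT."""
    lefts, rights = defaultdict(list), defaultdict(list)
    for a, b in zip(tokens, tokens[1:]):
        if b not in rights[a]:
            rights[a].append(b)
        if a not in lefts[b]:
            lefts[b].append(a)
    left_cats = {g: d for g, d in lefts.items() if len(d) >= 2}
    right_cats = {g: d for g, d in rights.items() if len(d) >= 2}
    return left_cats, right_cats

# Hypothetical short input: the opening syllables of INPUT 1, cut off early.
tokens = ['START', 'volt', 'EGY', 'szer', 'egy', 'O', 'reg', 'KI', 'raly',
          'PAUSE', 'es', 'HA', 'rom', 'szep', 'LE', 'any', 'a', 'STOP',
          'az', 'O', 'reg', 'KI', 'raly', 'SZER']

bonds = extract_bonds(tokens)
edited_bonds = eliminate_haplology(bonds)
left_cats, right_cats = extract_cats(tokens)
print(bonds, edited_bonds, left_cats, right_cats)
```

On this fragment the three repeated adjacencies O—reg, reg—KI, KI—raly reduce to O—reg and KI—raly, as in Example 2; O comes out as a LEFT CAT with domain {egy, az} and raly as a RIGHT CAT with domain {PAUSE, SZER}.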


2. Bonds 


command: cblr (cats, bonds, left, right); it extracts bonds and CATs (categories). 



BONDS 1

output

6 | 7 | 1 | O_reg
7 | 8 | 2 | reg_KI
8 | 9 | 3 | KI_raly
12 | 13 | 4 | HA_rom
13 | 15 | 5 | rom_LE
15 | 16 | 6 | LE_any
17 | 12 | 7 | a_HA
23 | 24 | 8 | VOL_na
25 | 17 | 9 | mind_a
38 | 39 | 10 | OR_szag


BONDS 1 edited

O_reg
KI_raly
HA_rom
LE_any
VOL_na
OR_szag


Haplology is eliminated. BONDS with PAUSE and STOP are ignored in this 
experiment, although they can be meaningful. 


3. CATS (CATs, categories) 

LEFT CAT has its domain on the left, RIGHT CAT has its domain on the right.


LEFT CATS 1

No. | domain | CAT
1 | START, a | volt
2 | ott, volt | EGY
3 | EGY, szer | egy
4 | az, egy | O
5 | raly, volt | PAUSE
6 | a, mert, es | HA
7 | rom, szep | LE
8 | mind, szag, any | a
9 | a, ni, szag | STOP
10 | lett, te | VOL
11 | PAUSE, na | mind
12 | egy, rom | OR


RIGHT CATS 1

No. | CAT | domain
1 | volt | EGY, PAUSE
2 | EGY | egy, szer
3 | egy | OR, O
4 | raly | PAUSE, SZER
5 | PAUSE | mind, es
6 | rom | LE, OR, szep
7 | any | a, a, at
8 | a | HA, STOP, volt
9 | STOP | END, az, ez
10 | na | NE, mind
11 | szag | STOP, a





CATS with “mute” generators START, END, and STOP (highlighted) are
erased in this experiment if there is only one more generator besides the mute one; if
more (see line 8 in RIGHT CATS 1), only the mute one is erased.
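
The editing rule can be written as a short filter. The sketch below is an assumption-laden illustration in Python, not the author's procedure: the function name edit_cats is hypothetical, a CAT that is itself mute is dropped, and PAUSE is treated as mute alongside START, END, and STOP. The last choice is my assumption; it appears to be what reproduces the edited tables.

```python
def edit_cats(cats, mute=('START', 'END', 'STOP', 'PAUSE')):
    """cats: list of (cat, domain) pairs. Returns the edited list.
    Assumption: PAUSE is treated as mute in addition to START, END, STOP."""
    edited = []
    for cat, domain in cats:
        if cat in mute:
            continue                          # the CAT itself is mute
        kept = [g for g in domain if g not in mute]
        if len(kept) == len(domain):
            edited.append((cat, domain))      # no mute generator in the domain
        elif len(kept) >= 2:
            edited.append((cat, kept))        # erase only the mute generator
        # otherwise only one generator is left and the whole CAT is erased
    return edited

# RIGHT CATS 1 as (CAT, domain) pairs
right_cats_1 = [
    ('volt', ['EGY', 'PAUSE']), ('EGY', ['egy', 'szer']), ('egy', ['OR', 'O']),
    ('raly', ['PAUSE', 'SZER']), ('PAUSE', ['mind', 'es']),
    ('rom', ['LE', 'OR', 'szep']), ('any', ['a', 'a', 'at']),
    ('a', ['HA', 'STOP', 'volt']), ('STOP', ['END', 'az', 'ez']),
    ('na', ['NE', 'mind']), ('szag', ['STOP', 'a']),
]

edited = edit_cats(right_cats_1)
for cat, domain in edited:
    print(cat, domain)
```

With these settings six RIGHT CATS survive (EGY, egy, rom, any, a, na), and the domain of a shrinks to HA, volt.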


LEFT CATS 1 (edited)

No. | domain | CAT
1 | ott, volt | EGY
2 | EGY, szer | egy
3 | az, egy | O
4 | a, mert, es | HA
5 | rom, szep | LE
6 | mind, szag, any | a
7 | lett, te | VOL
8 | egy, rom | OR


RIGHT CATS 1 (edited)

No. | CAT | domain
1 | EGY | egy, szer
2 | egy | OR, O
3 | rom | LE, OR, szep
4 | any | a, a, at
5 | a | HA, volt
6 | na | NE, mind


This concludes the analysis of OUTPUT from INPUT P1.

To prepare OUTPUT 1 for the next input P2, P1 is rewritten (compacted) into
PP1 in accordance with the table of BONDS. The final 'END' is removed.


OUTPUT 1, compacted

PP1 = char('START', 'volt', 'EGY', 'szer', 'egy', 'Oreg', 'KIraly', 'PAUSE', 'es', 'HArom', 'szep', 'LEany', 'a', ...
'STOP', 'az', 'Oreg', 'KIraly', 'SZER', 'et', 'te', 'VOLna', 'mind', 'a', 'HArom', 'LEany', 'at', 'FERJ', 'hez', ...
'AD', 'ni', 'STOP', 'ez', 'nem', 'is', 'lett', 'VOLna', 'NE', 'hez', 'mert', 'HArom', 'ORszag', 'a', 'volt', 'PAUSE', ...
'mind', 'a', 'HArom', 'LEany', 'a', 'ra', 'JUT', 'ott', 'EGY', 'egy', 'ORszag', 'STOP');
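
Compacting an input along a table of bonds amounts to one left-to-right pass that fuses each bonded doublet into a single generator. The following Python sketch illustrates the idea under stated assumptions: the function name compact is hypothetical, fusing is greedy, and the author performed this rewriting manually rather than with code.

```python
def compact(tokens, bonds):
    """Fuse every adjacent pair that is a registered bond into one generator."""
    bonded = set(bonds)
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in bonded:
            out.append(tokens[i] + tokens[i + 1])   # e.g. 'KI' + 'raly' -> 'KIraly'
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

# BONDS 1 (edited)
bonds_1 = [('O', 'reg'), ('KI', 'raly'), ('HA', 'rom'),
           ('LE', 'any'), ('VOL', 'na'), ('OR', 'szag')]

# A fragment of INPUT 1
fragment = ['az', 'O', 'reg', 'KI', 'raly', 'SZER', 'et', 'te', 'VOL', 'na']
compacted = compact(fragment, bonds_1)
print(compacted)
```

The fragment comes out as az, Oreg, KIraly, SZER, et, te, VOLna, matching the corresponding stretch of PP1.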


After any step of acquisition is completed, the subsequent input cannot be 
perceived on the same terms as the previous one. If some stable BONDS were recorded, 
the next input is perceived in terms of bonded doublets as generators. This seems to be 
the major difference between the statistical analysis of a corpus and the autonomic 
bootstrapping. In the eyes and ears of robot-child, the world gradually takes meaning
through discovering its regularity. This process can be visualized as the moving
borderline between the old and the new, as in walking through darkness with a flashlight.

Strictly speaking, all BONDS and CATS structures can be remembered, but for the purpose of 
illustration only the edited ones will be kept in memory and transferred to the next STEP. 

Note that addresses in G may change from step to step, but addresses in BONDS 
and CATS always correspond to the current space G. 


STEP 2 


OUTPUT 1 changes the perception of the next INPUT 2 so that the bonded 
syllables are combined into words, whether complete or incomplete. Haplology is 
eliminated. 


INPUT 2 

P2=char('HA', 'nem', 'a', 'HOGY', 'an', 'nincs', 'HA', 'rom', 'EGY', 'for', 'ma', 'AL', 'ma', 'PAUSE', 'ugy', 'a', ...
'HA', 'rom', 'OR', 'szag', 'sem', 'volt', 'EGY', 'for', 'ma', 'STOP', 'azt', 'MOND', 'ta', 'EGY', 'szer', 'a', 'KI', ...
'raly', 'a', 'LE', 'any', 'a', 'i', 'nak', 'hogy', 'AN', 'nak', 'AD', 'ja', 'a', 'LEG', 'szebb', 'OR', 'szag', 'at', 'PAUSE', ...
'A', 'mely', 'ik', 'ot', 'PAUSE', 'LEG', 'job', 'ban', 'SZER', 'e', 'ti', 'STOP');

Compacting the input along BONDS 1:

P2=char('HA', 'nem', 'a', 'HOGY', 'an', 'nincs', 'HA', 'rom', 'EGY', 'for', 'ma', 'AL', 'ma', 'PAUSE', 'ugy', 'a', ...
'HArom', 'ORszag', 'sem', 'volt', 'EGY', 'for', 'ma', 'STOP', 'azt', 'MOND', 'ta', 'EGY', 'szer', 'a', 'KIraly', 'a', ...
'LEany', 'a', 'i', 'nak', 'hogy', 'AN', 'nak', 'AD', 'ja', 'a', 'LEG', 'szebb', 'ORszag', 'at', 'PAUSE', ...
'A', 'mely', 'ik', 'ot', 'PAUSE', 'LEG', 'job', 'ban', 'SZER', 'e', 'ti', 'STOP');


Next, PP1 and P2 are concatenated: P=strvcat( PP1, P2, 'END'); 


output: 

The complete G TABLE is omitted. 



1. Bonds 2

BONDS 2 add up to BONDS 1


BONDS 2 output

1 | volt_EGY
2 | EGY_for
3 | EGY_szer
4 | Oreg_KIraly
5 | HArom_LEany
6 | HArom_ORszag
7 | LEany_a
8 | a_HArom
9 | STOP_END
10 | mind_a
11 | for_ma


BONDS 2 edited;
strong bonds in bold

1 | volt_EGY
2 | EGY_szer
3 | Oreg_KIraly
4 | HArom_LEany
5 | HArom_ORszag
6 | LEany_a
7 | a_HArom
8 | mind_a
9 | EGY_for_ma


BONDS 2 illustrate the fluid and provisional character of BONDS and the idea 
of equilibrium. 

BONDS 1 include O_reg and KI_raly. There is a persistent tendency for
their adjacency, so that until further solidification of bonds there is an equilibrium soup:

{O, reg, KI, raly, O_reg, KI_raly, O_reg_KI_raly}

The quantitative aspect is ignored here. The “weights” in the soup correspond to
concentrations in chemistry and probabilities in Pattern Theory [2]. The term “weight,”
an artifact of the first neural networks, seems very inappropriate because, unlike
concentration, probability, and even energy, it is not normalized. The comparison with
neural networks, however, is avoided here. The position of the equilibrium depends on
the topic, context, and previous history. We tend to speak in larger blocks when the 
subject is familiar and frequent. We might stumble on an unfamiliar terrain. 




The concept of equilibrium cannot be applied to the mind as a whole, where
equilibrium is continuously shifting because of the aging of memory traces and the
influx of new traces. In the physical sense, equilibrium is never achieved in any living system.

Gradually the representation becomes more and more coarse as the bonds and 
categories solidify and the atomic entries become tagged by their categories. This process 
of tagging is nothing mysterious: a generator is in equilibrium with all its neighbors and a 
category is the neighbor of its entire domain. Categories overlay star topology on tree 
topology. 

Distinctively, the perception also becomes coarser: from sounds to phonemes to 
syllables to morphemes to words to expressions. 

The idea of local equilibrium that I am trying to convey, not for the first time, but 
still with a great difficulty, is very simple in chemistry. The best way is just to look into 
the textbook of general chemistry, although the illustrative material is scattered all over 
the course. A chemical substance is always in equilibrium with all its possible fragments, 
down to the atoms, but the concentration of all or absolute majority of fragments or 
transposition at certain conditions (regarded “normal” in chemistry, physics, and human 
environment) is practically zero. 

Example: Vinegar is acetic acid $\mathrm{CH_3COOH}$ in water. It is in equilibrium with its two
fragments: $\mathrm{CH_3COOH \rightleftharpoons CH_3COO^- + H^+}$ (it is $\mathrm{H^+}$ that tastes sour, whatever its
origin). Theoretically, there could be equilibriums along all bonds, for example,
$\mathrm{CH_3COOH \rightleftharpoons CH_3CO^+ + OH^-}$, which at normal conditions is completely shifted to
the left.

Nevertheless, the chemical transformations run through such rare, unstable, and
improbable states. Otherwise, everything that could chemically happen with the atoms of
our body would happen in an instant.

In terms of practical computation, equilibrium means that most of the memory of
your personal computer is inaccessible at the moment, which only shows how unnatural
computers are in their accessibility. It is hard not to remark that in the digital age the
structure of society becomes unnatural if your most intimate social identity tags become
accessible.

As a further illustration, recall how difficult it could be sometimes to retrieve a 
name of a person or location. But if we remember its fragment or any link, for example, 
that it is something related to horses, the name comes up: Mr. Rein? Mr. Spur? Mr. Bay? 
Mr. Hay? Mr. Oats, of course! (the idea is borrowed from a short story by Chekhov). This 
demonstrates the difference between the Hopfield network and human memories, although the former has 
the ability of a retrieval by fragment. 

The memory I have in mind does not have addresses in the sense ROM and RAM 
have. The address of the natural memory cell is anything in equilibrium with the content 
of the cell. This can be either less (first letter of the name Oats) or more (“horses”) than 
the cell content. I believe neurophysiology has its own view of the problem, but in 
psychology it has been known since long as association. Note that the behavior of the 
acetic acid in the above example is called dissociation from left to right and association 
from right to left. 


Bonds HArom—LEany and HArom—ORszag dissociate and associate
in the same manner as acetic acid. Suppose robot-child with Salt as its only life
experience hears the word HArom (“three”). The words LEany (“girl”) and
ORszag (“land”) will immediately activate in its memory.
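
Retrieval by association can be pictured as a lookup over the bond table: the "address" of a generator is whatever is bonded to it. The Python sketch below is only an illustration of that idea; the function name associates and the three-bond memory are assumptions, and no weights or concentrations are modeled.

```python
def associates(word, bonds):
    """Return every generator bonded to `word`, on either side, in table order."""
    linked = []
    for a, b in bonds:
        if a == word and b not in linked:
            linked.append(b)
        elif b == word and a not in linked:
            linked.append(a)
    return linked

# A small hypothetical memory holding three bonds between words
memory = [('Oreg', 'KIraly'), ('HArom', 'LEany'), ('HArom', 'ORszag')]

activated = associates('HArom', memory)
print(activated)
```

Hearing HArom activates LEany and ORszag; hearing KIraly activates Oreg, since the lookup works in both directions, like association and dissociation.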

2. CATS 2 


LEFT CATS 2 (edited)

No. | domain | CAT
1 | a, sem | volt
2 | ott, rom, ta, volt | EGY
3 | EGY, szer | egy
4 | az, egy | Oreg
5 | a, Oreg | KIraly
6 | a, mert, es | HArom
7 | HArom, a, szep | LEany
8 | KIraly, LEany, ORszag, ja, mind, nem, szer, ugy | a
9 | KIraly, ban | SZER
10 | lett, te | VOLna
11 | LEany, ORszag | at
12 | hez, nak | AD
13 | HA, ez | nem
14 | HArom, egy, szebb | ORszag
15 | AL, for | ma
16 | AN, i | nak


RIGHT CATS 2 (edited)

No. | CAT | domain
2 | EGY | egy, for, szer
3 | szer | a, egy
 | egy | ORszag, Oreg
 | KIraly | SZER, a
 | HArom | LEany, ORszag, szep
8 | LEany | a, a, at
9 | a | HOGY, HArom, KIraly, LEG, LEany, i, volt
11 | SZER | e, et
12 | VOLna | NE, mind
14 | AD | ja, ni
15 | nem | a, is
16 | ORszag | a, sem, at
17 | nak | AD, hogy
18 | LEG | job, szebb


The new CATS 2 are in bold type. The previous CATS 1 may still be in 
memory. 


OUTPUT 2 (compacted) 

PP2=char('START', 'volt', 'EGYszer', 'egy', 'Oreg', 'KIraly', 'PAUSE', 'es', 'HArom', ...
'szep', 'LEany', 'a', 'STOP', 'az', 'Oreg', 'KIraly', 'SZER', 'et', 'te', 'VOLna', 'mind', ...
'a', 'HArom', 'LEany', 'at', 'FERJ', 'hez', 'AD', 'ni', 'STOP', 'ez', 'nem', 'is', 'lett', ...
'VOLna', 'NE', 'hez', 'mert', 'HArom', 'ORszag', 'a', 'volt', 'PAUSE', 'mind', 'a', ...
'HArom', 'LEany', 'a', 'ra', 'JUT', 'ott', 'EGY', 'egy', 'ORszag', 'STOP', 'END', 'HA', ...
'nem', 'a', 'HOGY', 'an', 'nincs', 'HA', 'rom', 'EGYforma', 'AL', 'ma', 'PAUSE', ...
'ugy', 'a', 'HArom', 'ORszag', 'sem', 'volt', 'EGYforma', 'STOP', 'azt', 'MOND', ...
'ta', 'EGYszer', 'a', 'KIraly', 'a', 'LEany', 'a', 'i', 'nak', 'hogy', 'AN', 'nak', 'AD', 'ja', ...
'a', 'LEG', 'szebb', 'ORszag', 'at', 'PAUSE', 'A', 'mely', 'ik', 'ot', 'PAUSE', ...
'LEG', 'job', 'ban', 'SZER', 'e', 'ti', 'STOP');


GRAMMAR 2 

Until now we were manipulating syllables in a formal manner, supposedly not 
knowing what they meant, although I felt a constant pressure of meaning. Now we can 
try to tentatively interpret what it all means. Translations are given for words and some 
morphemes/lexemes. 














































Left or right is indicated by letters L and R: EGY L means LEFT CAT ‘EGY’. 


NOTE. This example is a reminder of agglutination in Hungarian:

'LEany', 'a', 'i', 'nak' = “girl,” “his/her,” “Plural,” “Dative”;
'LEanyainak' = [give] “to his/her girls”

Most probably, each of the countless blocks of morphemes in such
languages as Hungarian, Finnish, Russian, and Turkish is acquired by children
as a whole. As a speaker of Russian, however, I must note that sometimes the
trailing endings need some small but perceptible time to arrange in order before
saying. I would not be able to use the following word up to, probably, the age of 7
or 8, and even now I would avoid it by all means:

проигрывающиеся

pro-igr-yva-yu-shchi-e-sya (7 morphemes)

[those that] can be played (playable) [for example, on DVD player]

or: [those that] are being played now; igr is the stem (“play”)

Table GRAMMAR 2 is compiled manually from CATS 2. 


GRAMMAR 2

No. | LEFT | CAT | RIGHT | INTERPRETATION | LABEL
1 | | EGY R “one” | -egy, -szer: EGYegy “one”, EGYszer “once” | Semantics (“one”) | EGY R
2 | -ott, -ta, volt: Past, “was” | EGY L “one” | | Verb (Past) | volt
3 | az, egy | Oreg “old” | | Article | egy
4 | a, Oreg | KIraly “king” | | Adjective | Oreg
5 | | egy “a” | ORszag “land”, Oreg “old” | Noun Single Indefinite | ORszag
6 | HArom, a, szep: “three”, “the”, “beautiful” | LEany “girl” | | Noun group (Numeral, Adjective) | szep
7 | | HArom “three” | LEany, ORszag, szep | Numeral | HArom
8 | | LEany | -a, -a, -at | Possessive | -a
9 | LEany, ORszag | at | | Noun Possessive + Accusative | at
10 | -hez, -nak: “to”, Dative | AD “give” | | case endings Allative, Dative | -hez
11 | | AD | -ja 3rd Person, -ni Infinitive | Verb | AD R
12 | HArom, egy, szebb: “three”, “a”, “best” | ORszag “land” | | Noun group | ORszag
13 | ALma “apple,” forma “form” | (ma) | | (sound) |
14 | AN, -i-: ANnak “to that one”, -i- Plural | -nak | | Dative | nak
15 | | STOP | az, azt, ez: “the”, “this” | Article, Pronoun, (Sentence start) | STOP R
16 | | SZER | -e, -et: SZEReti “loves”, SZERet “loved” | 3rd Person. See comments to STEP 4 | SZER R
17 | | LEG | -job, -szebb (LEGszebb “best”) | Superlative | LEG


IMPORTANT: Interpretation relates to the domain of CAT, not to the CAT itself. 

I hope the GRAMMAR 2 table speaks for itself, very much like an infant, as it is 
supposed to. What I see in it is the very beginning of crystallization of individual 
grammar in the individual mind of the robot-child who has never heard anything but the 
story of Salt. If Salt is its only experience, semantics is absent. 

But how are categories labeled in the mind of an infant? Certainly not by terms of 
grammar. Of course, the category is just a generator and it is labeled just by its 
individuality of a set member. But some first words that enter CATS may contribute 
themselves as internal labels for patterns of grammar. This is reasonable in case of 
LEFT CATS, but for RIGHT CATS the name of the CAT can be taken as label. I wonder
if this is because Hungarian is left-branching. Will that be different in Spanish? To speculate, the
first impressions of robot-child imprint large subsequent categories of either
syntax or semantics (I begin to suspect that their opposition may be as useful but as artificial as the
concept of syllable). Could this work for a real child? It is quite probable that the first 
impressions of the child imprint internal labels for large natural categories of light, 
darkness, comfort, discomfort, hunger, satisfaction, fear, and joy, from which the tree of 
knowledge grows. Ulf Grenander describes in Patterns of Thought [2] the outer branches 
of the tree. 


STEP 3 


In subsequent STEPS only compacted new inputs will be shown. 

INPUT 3 

P3=char('FEL', 'elj', 'NEK', 'em', 'PAUSE', 'ED', 'es', 'LE', 'any', 'om', 'PAUSE', 'hogy', 'SZER', 'etsz', ...
'EN', 'gem', 'PAUSE', 'KER', 'dez', 'te', 'a', 'LEG', 'i', 'do', 'sebb', 'ik', 'et', 'STOP', 'mint', 'a', 'GA', 'lamb', ...
'a', 'TISZ', 'ta', 'BUZ', 'at', 'PAUSE', 'MOND', 'ta', 'a', 'LE', 'any', 'STOP', 'hat', 'te', 'PAUSE', 'ED', 'es', ...
'LE', 'any', 'om', 'PAUSE', 'KER', 'dez', 'te', 'a', 'KOZ', 'ep', 'sot', 'STOP', 'en', 'ugy', 'ED', ...
'es', 'AP', 'am', 'PAUSE', 'mint', 'FOR', 'ro', 'NYAR', 'ban', 'a', 'SZEL', 'lot', 'STOP');


No compacting is needed 
P =183 G =96 


BONDS 3 and CATS 3 (Strong bonds are in bold type) 


BONDS 3 edited

a-LEany | Oreg-KIraly
te-a | HArom-LEany
mind-a | HArom-ORszag
MOND-ta | LEany-a
ED-es | LEany-om
es-LEany | a-HArom
KER-dez | a-LEG
dez-te |





RIGHT CATS 3

CAT | domain
volt | EGYforma, EGYszer, PAUSE
EGYszer | a, egy
egy | ORszag, Oreg
KIraly | PAUSE, SZER, a
PAUSE | A, KER, LEG, MOND, hogy, mind, mint, ED, es, ugy
HArom | EGYforma, LEany, ORszag, szep
LEany | STOP, a, om, a, at
a | GA, HOGY, HArom, KIraly, KOZ, LEG, LEany, SZEL, TISZ
a | i, volt
STOP | END, FEL, HA, az, azt, ez, hat, mint, en
SZER | e, et, etsz
et | STOP, te
te | PAUSE, VOLna, a
VOLna | NE, mind
at | FERJ, PAUSE
AD | ja, ni
nem | a, is
ORszag | STOP, a, sem, at
EGYforma | AL, STOP
ugy | a, ED
ta | BUZ, EGYszer, a
i | do, nak
nak | AD, hogy
hogy | AN, SZER
LEG | i, job, szebb
ik | et, ot
ban | SZER, a
es | AP, LEany
mint | FOR, a


LEFT CATS 3

domain | CAT
START, a, sem | volt
ta, volt | EGYszer
EGY, EGYszer | egy
az, egy | Oreg
a, Oreg | KIraly
KIraly, em, gem, ma, om, te, volt, am, at, ot | PAUSE
a, mert, nincs, es | HArom
HArom, a, es, szep | LEany
EGYszer, KIraly, LEany, ORszag | a
ban, ja, lamb, mind, mint, nem, ta, te, ugy | a
EGYforma, LEany, ORszag, a, et, lot, ni, sot, ti | STOP
KIraly, ban, hogy | SZER
SZER, ik | et
dez, et, hat | te
lett, te | VOLna
PAUSE, VOLna | mind
BUZ, LEany, ORszag | at
hez, nak | AD
HA, ez | nem
HArom, egy, szebb | ORszag
HArom, volt | EGYforma
PAUSE, en | ugy
PAUSE, azt | MOND
MOND, TISZ | ta
LEG, a | i
AN, i | nak
PAUSE, nak | hogy
PAUSE, a | LEG
mely, sebb | ik
NYAR, job | ban
PAUSE, ugy | ED
PAUSE, STOP | mint


Some Hungarian morphemes, such as a and t, are used as markers in several roles 
across various parts of speech. The red border in the CATS 3 tables encloses an example 
of how this multifunction can be dealt with. The lines of CAT a, both LEFT and RIGHT, 
are split depending on the stress of the syllables in the domain. 

Note that in Hungarian the possessor is unmarked, while the possession is: 

A KIraly ORszaga, “The King land-his”, “The King’s land.”

In LEFT CATS, if a precedes a stressed syllable, a strong hypothesis of the
robot-child is that a is the definite article. If the syllable next after a is unstressed or the word is
monosyllabic, it is a possession mark: ORszaga volt, “his/her-land was.”

In RIGHT CATS, if a noun is followed by a, a somewhat weak hypothesis can be
formed that a marks a possession: LEanya, “his/her-girl.” Otherwise, it can be the
definite article.
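
The stress test for a can be stated as a two-line heuristic, since stress is encoded in the transcription by capitalization. The Python sketch below is a hypothetical illustration of that one hypothesis only; the function names are assumptions, and the two domains are the split lines of CAT a from the CATS 3 tables.

```python
def is_stressed(generator):
    """Stressed syllables are capitalized in the transcription."""
    return generator[:1].isupper()

def guess_role_of_a(next_generator):
    """Hypothesis for 'a' judged by what follows it: a stressed follower
    suggests the definite article, an unstressed one a possession mark."""
    if is_stressed(next_generator):
        return 'definite article'
    return 'possession mark'

# The two lines of CAT a, split by the stress of the domain
stressed_domain = ['GA', 'HOGY', 'HArom', 'KIraly', 'KOZ', 'LEG', 'LEany', 'SZEL', 'TISZ']
unstressed_domain = ['i', 'volt']

roles = {g: guess_role_of_a(g) for g in stressed_domain + unstressed_domain}
for g, role in roles.items():
    print('a +', g, '->', role)
```

The heuristic is deliberately crude: it reproduces the split of the two a lines and nothing more, and a wrong guess remains open to revision by later input.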


BONDS 1-3, edited

BONDS 1 | BONDS 2 | BONDS 3
O_reg | EGY_for_ma | MOND_ta
KI_raly | EGY_szer | KER_dez
HA_rom | volt_EGY | ED_es
LE_any | Oreg_KIraly |
VOL_na | HArom_LEany |
OR_szag | HArom_ORszag |
 | LEany_a |
 | a_HArom |
 | mind_a |


Blocks of words are not strong bonds. I left them in the BONDS 1-3 table to
illustrate the semantic and contextual significance of longer blocks: volt_EGYszer
“there was once,” Oreg—KIraly “old king,” HArom—LEany “three girls,”
HArom—ORszag “three lands.” The plural of the noun after a numeral is unmarked in
Hungarian.
























19 


LEFT CATS 3 (edited) 

ta, volt 

EGY szer 

KIraly, em, gem, ma, 

om 

te, volt, am, at, ot 

PAUSE 

a, mert, nincs, es 

HArom 

HArom, a, es, szep 

LEany 

EGYszer, KIraly, LEany 

ORszag, 

a 

ban, ja, lamb 

mind, mint, nem, ta, te, 

ugy 

a 

KIraly, ban, hogy 

SZER 

SZER, ik 

et 

dez, et, hat 

te ??? TE 

lett, te 

VOLna 

BUZ, LEany, ORszag 

at 

hez, nak 

AD 

HA, ez 

nem 

HArom, egy, szebb 

ORszag 

HArom, volt 

EGYforma 

MOND, TISZ 

ta 

LEG, a 

i 

AN, i 

nak 

mely, sebb 

ik 

NYAR, job 

ban 


RIGHT CATS 3 (edited) 

volt 

EGYforma, EGYszer 

EGYszer 

a, egy 

egy 

ORszag, Oreg 

HArom 

EGYforma, LEany, 

ORszag, szep 

LEany 

-a, -om, -a, -at 

a 

GA, HOGY, HArom, KIraly 

KOZ, LEG, LEany, 

SZEL, TISZ 

a 

i, volt 

SZER 

e, et, etsz 

AD 

ja, ni 

nem 

a, is 

ORszag 

a, sem, at 

i 

dd, nak 

nak 

AD, hogy 

LEG 

i, job, szebb 

ik 

et, ot 

ban 

SZER, a 

es 

AP, LEany 

mint 

FOR, a 














































GRAMMAR 3 

STEP 3 does not add much to the grammar. The following are some new or 
expanded old categories: 









No.  LEFT              CAT    RIGHT         INTERPRETATION                   Label 

[GRAMMAR 2 is here] 

18   -mely-, -sebb-    ik                   Suffix of “adjectivity,”         ik 
                                            working as an object pronoun 
     (Amelyik, “which”; LEGsebbik, “which is the most beautiful”) 

19   HA, ez            nem                  Semantics: negativity            nem 
     (HAnem, “however”; ez nem, “this is not”; nem, “not, no”) 

20                     SZER   e, et, etsz   See comments to STEP 4 


COMMENTS to STEP 3: 


1. Category { NYAR, job } ban is a wrong hypothesis. NYARban means “in 
the summer,” but in the phonological LEG-job-ban, “best of all,” the adverb 
morpheme is an, not ban, “in.” Morphology requires LEG-jobb-an. The double b is 
a stem variation. 


NOTE. The struggle of morphology and phonetics raises a lot of dust over the 
notion of syllable. This problem has been repeatedly addressed in the literature, 
sometimes in strong words against phonology, but is too technical to discuss here. 
Regarding Hungarian, I wish to refer to the ingenious solution found, as I suspect, 
by people at least partly outside linguistics. Instead of syllables, “half-syllables” 
were used as atoms of speech [3]. Examples: ta-, a-, te-, le-, ke- (first 
half-syllable), -a, -e, -i, ..., -ol, -el, -in, -ek (second half), etc. 326 half-syllables 
describe 95% of general Hungarian texts. This approach essentially uses haplology 
as the means of “mutual recognition” of atoms of speech. 




2. PAUSE in the left CAT 3 follows an unstressed syllable. 

3. MONDta “said,” and TISZta “clean” are together for a wrong reason. 


STEP 4 


INPUT 4 

P4=char('no', 'most', 'TE', 'ged', 'KERdez', 'lek', 'PAUSE', 'FOR', 'dult', 'a', 'LEG', 'kis', 'ebb', 'ik', 'hez',... 
'STOP', 'MOND', 'jad', 'hogy', 'SZER', 'etsz', 'STOP', 'ugy', 'ED', 'es', 'AP', 'am', 'PAUSE', 'a', 'HOGY',... 
'az', 'EM', 'ber', 'ek', 'a', 'sot', 'PAUSE', 'FEL', 'el', 'te', 'a', 'KI', 'csi', 'KIraly', 'kis', 'asz',... 
'szony', 'STOP', 'mit', 'BE', 'szelsz', 'te', 'PAUSE', 'FOR', 'medt', 'ra', 'a', 'KIraly', 'STOP', 'ki', 'az',... 
'UD', 'var', 'om', 'bol', 'de', 'meg', 'az', 'ORszag', 'om', 'bol', 'is', 'PAUSE', 'STOP', 'ne', 'IS', 'las', 'sa', 'lak',... 
'PAUSE', 'ha', 'csak', 'EN', 'nyi', 're', 'SZER', 'etsz', 'STOP'); 


P =264 G =136 


BONDS 4 (unedited) 

5   6     Oreg KIraly 
9   11    HArom LEany 
9   32    HArom ORszag 
11  12    LEany a         “his girl” 
11  69    LEany om        “my girl” 
12  39    a HOGY          “as” 
12  9     a HArom 
12  6     a KIraly 
12  54    a LEG 
12  11    a LEany 
15  70    SZER etsz       “you (s.) love” 
17  12    te a 
19  12    mind a 
51  15    hogy SZER 
68  11    EDes LEany      “dear girl” 
69  7     om PAUSE 
69  124   om bol          “out of my” 
70  13    etsz STOP 
73  17    KERdez te       “asked” 
87  88    AP am           “my father” 
88  7     am PAUSE 


The CATS are omitted at this step. 





























COMMENTS to STEP 4: 

1. In STEP 4 BONDS and CATS add new words: APam, “my father,” the suffix 
-bol, “out of,” and the personal suffix category of possession -om, “my.” Due to the 
harmony of vowels, the marking morphemes belong to one of two phonetic series. The 
other “my” morpheme is -am as in APam. This is where Rule 2 comes into play: -am 
and -om form a phonetic category -m, “my,” which is not syllabic. Harmony of 
vowels requires a separate non-morphemic and non-syllabic category of vowel type. 
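
One way to picture the merger of -am and -om into -m is to set the harmonizing vowel aside and group the variants by what remains. The Python sketch below is an illustrative assumption, not a restatement of Rule 2; the function names are hypothetical, and the vowel list reflects the transcription used here, which has no diacritics.

```python
# Hypothetical sketch: group suffix variants that differ only in their vowel.
# Assumption: the transcription uses plain a, e, i, o, u without diacritics.

VOWELS = set("aeiou")


def consonant_skeleton(suffix: str) -> str:
    """Drop the vowels of a suffix and keep the rest, e.g. 'om' -> 'm'."""
    return "".join(ch for ch in suffix.lower() if ch not in VOWELS)


def merge_by_skeleton(suffixes):
    """Map each non-vowel skeleton to the list of suffixes that share it."""
    categories = {}
    for suffix in suffixes:
        categories.setdefault(consonant_skeleton(suffix), []).append(suffix)
    return categories


if __name__ == "__main__":
    print(merge_by_skeleton(["am", "om", "at", "ot"]))
```

The vowel that was set aside is exactly the information the text assigns to a separate, non-syllabic category of vowel type.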

2. Line 20 in GRAMMAR 3 is evidence of the confusion that comes from the 
fuzziness of the syllable. I created the confusion by resisting the temptation to split 
SZEReti into SZER-et-i, as the morphology required, because SZERet is the stem of the 
verb “to love.” There is no such word or stem as szer. There is not enough data for 
robot-child at this step to form the bond SZER_et, for which the rest of the tale will give 
enough evidence. The total count for the entire tale is 11: 

1   SZER-et-te   (he) loved       1   SZER-et-ik    (they) love 
2   SZER-e-ti    (he) loves       1   SZER-et-tek   (they) loved 
3   SZER-etsz    (you) love       3   SZER-et-em    (I) love 

The syllabic division of SZER-e-ti and SZER-etsz conflicts with morphology. 
There is a good chance, however, that the initial hypotheses in STEPS 1 to 4 will be 
replaced by better founded ones. The morpheme SZ, which is not exactly syllabic, will be 
differentiated phonetically. 

Language acquisition certainly starts at the phonemic level. I have already noted 
elsewhere that the optimality principle of Prince and Smolensky, first developed in 
phonetics, is very chemistry-friendly. Paradoxically, it might be easier to translate speech 
than text. This is a very intriguing problem, which will be left until better times, however. 



3. Note the crystallization (highlighted yellow) of the definite article a and nouns 
a_HArom, “the three,” a_KIraly, “the king,” a_LEany, “the girl,” as well as the 
standard block mind_a, “each/all” + article, which could well be written as one word. 
Further, the verb forms become more solid: KERdez_te, “(he/she) asked.” This poses 
a question: how much strength should we attribute to the bonds? The answer is: I have no 
idea and this is exactly the point of the project. We should make a model and trust 
robot-child to tackle the problem on its own. Until then, my attribution of bold type to strong 
bonds is intuitive, which is not much better than arbitrary. Since I am familiar with the 
meaning of the words, I must be excluded from decision making. I realize that this is a 
truly heretical idea, but definitely not the only heretical idea in the realm of ideas, some of 
them later accepted. To reformulate this idea: I am forbidden to be the homunculus for 
robot-child because I have a mind of my own. 

The last squeak of homunculus: “Should we cut robot-child into Rotchild?” 


STEP 5 


P5=char('HI', 'a', 'ba', 'sirt', 'a', 'KIraly', 'kis', 'asz', 'szony', 'PAUSE', 'HI', 'a', 'ba', 'MA', 'gyar', 'az',... 
'ta', 'hogy', 'az', 'EM', 'ber', 'ek', 'SZER', 'et', 'ik', 'a', 'sot', 'PAUSE', 'VI', 'lag', 'ga', 'KEL', 'lett',... 
'hogy', 'MENJ', 'en', 'STOP', 'EL', 'in', 'dult', 'a', 'KI', 'csi', 'KIraly', 'kis', 'asz', 'szony', 'SIR',... 
'va', 'PAUSE', 'es', 'BE', 'ert', 'egy', 'nagy', 'ER', 'do', 'be', 'STOP', 'ON', 'nan', 'nem', 'is', 'ment',... 
'TO', 'vabb', 'PAUSE', 'ott', 'elt', 'egy', 'DA', 'rab', 'ig', 'EGY', 'ma', 'ga', 'ban', 'STOP'); 


BONDS 5 (unedited) 

5    6      Oreg KIraly 
6    99     KIraly kis 
7    73     PAUSE KERdez 
7    68     PAUSE EDes 
7    8      PAUSE es 
9    11     HArom LEany 
9    32     HArom ORszag 
11   12     LEany a 
11   69     LEany om 
12   39     a HOGY 
12   9      a HArom 
12   110    a KI 
12   6      a KIraly 
12   54     a LEG 
12   11     a LEany 
12   108    a sot 
14   105    az EM 
15   16     SZER et 
15   70     SZER etsz 
17   7      te PAUSE 
17   12     te a 
19   12     mind a 
20   7      at PAUSE 
26   27     nem is 
33   136    a ba 
51   15     hogy SZER 
68   11     EDes LEany 
69   7      om PAUSE 
69   123    om bol 
70   13     etsz STOP 
73   17     KERdez te 
87   7      APam PAUSE 
98   12     dult a 
99   112    kis asz 
105  106    EM ber 
106  107    ber ek 
108  7      sot PAUSE 
110  111    KI csi 
111  6      csi KIraly 
112  113    asz szony 
135  33     HI a 

BONDS 5 (edited) 

KERdez te     kis asz 
KIraly kis    EM ber 
az EM         ber ek 
SZER et       KI csi 
a ba          asz szony 


The reason why I show the numbers of generators in BONDS tables is to illustrate 
the detection of haplology. The numbers follow without interruption: 

14   105    az EM 
105  106    EM ber 
106  107    ber ek 
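
The uninterrupted run of numbers can be followed mechanically: whenever the right generator of one bond is the left generator of the next, the two bonds join into a longer block. The Python sketch below is a minimal illustration under that assumption, not the MATLAB code of the project. The function name chain_bonds is hypothetical, and the greedy joining depends on the order of the rows.

```python
# Hypothetical sketch: join bonds (i, j) and (j, k) into longer chains.
# The generator numbers are taken from the BONDS 5 table.

def chain_bonds(bonds):
    """Greedily extend a chain whenever its last number starts the next bond."""
    chains = []
    for left, right in bonds:
        for chain in chains:
            if chain[-1] == left:
                chain.append(right)
                break
        else:
            # No existing chain ends in `left`: start a new one.
            chains.append([left, right])
    return chains


NAMES = {14: "az", 105: "EM", 106: "ber", 107: "ek", 110: "KI", 111: "csi"}

if __name__ == "__main__":
    bonds = [(14, 105), (105, 106), (106, 107), (110, 111)]
    for chain in chain_bonds(bonds):
        print(" ".join(NAMES[number] for number in chain))
```

With the four bonds listed, the first three merge into the block az EM ber ek, while KI csi stays a separate pair.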


Next CATS will be shown selectively because their interpretation will involve too 
much reference to their meaning. This will be difficult to follow without the knowledge 
of the language. 


RIGHT CATS 5 (illustrative selections) 

RIGHT CAT              Domain                                      Interpretation 

HArom, “three”         EGYforma, LEany, ORszag, szep               Noun or Adjective 
                       (“equal,” “girl,” “land,” “beautiful”) 
LEany                  a, om, a, at                                Noun suffixes 
a, “the”               HArom, KIraly, LEany                        Noun or numeral 
                       (“three,” “king,” “girl,” Nominative); 
                       sot (“salt,” Accusative) 
et                     ik, te                                      Verb suffixes 
AD                     ja, ni                                      Verb suffixes 
LEG-, “the most”       (i), job, kis, szebb                        Superlative 
                       (“good,” “little,” “beautiful”) 
ik                     a, et, hez, ot                              A very confused CAT 
EDes, “dear, sweet”    APam, LEany (“my father,” “girl”)           Noun 
KERdez                 lek, te                                     Verb suffixes 


LEFT CATS 5 (selectively) 

Domain                                        LEFT CAT            Interpretation 

MONDta, volt                                  EGYszer, “once”     Standard block 
(“said,” “was”) 
HArom, a, szep, EDes                          LEany, “girl”       Noun group 
(“three,” “the,” “beautiful,” “sweet”) 
LEany, ORszag, -var                           -om, “my”           Personal Possession 
(“girl,” “land,” “court” (second syllable)) 


Note the following CAT: 

a        HArom, KIraly, LEany, sot 

in which sot is the Object Case (Accusative) of SO, “salt.” Should I have capitalized 
monosyllabic nouns? 

Anticipating the next shaky steps of the model, I hope that if SO is going to be 
mentioned in various cases, the problem will be solved somehow. These are some of the 
later entries of SO in the text: SOtlan, “saltless,” SOval, “with salt,” as well as 
just SO. But how exactly it is going to be solved, I don’t know. Listening to the sound 
track, I cannot decide whether it is really stressed, although it seems to be so in a sot, 
“the salt (object).” The article a places it in the category of nouns. 

My Russian ear is not used to distinguishing between a long and a stressed vowel, 
which again brings us to the obvious idea that linguistics as an exact natural science must 
start with phonemes. Chemistry starts with elements. 


BONDS 5 (edited) 

6    99     KIraly kis 
14   105    az EM 
15   16     SZER et 
33   136    a ba 
73   17     KERdez te 
99   112    kis asz 
105  106    EM ber 
106  107    ber ek 
110  111    KI csi 
112  113    asz szony 


BONDS 5 add words: EM_ber_ek, “people,” KIraly_kis_asz_szony, 
“princess,” KERdez_te, “(he/she) asked.” 






STEP 6 


P6=char('EGYszer', 'MI', 'kor', 'mar', 'egy', 'ESZT', 'en', 'do', 'is', 'el', 'telt', 'PAUSE', 'ar', 'ra', 'JAR', 'ta',... 
'SZOM', 'szed', 'KIraly', 'fi', 'PAUSE', 'es', 'MEG', 'lat', 'ta', 'a', 'KIralykisasszonyt', 'STOP', 'MEG',... 
'tet', 'szett', 'a', 'KIraly', 'fi', 'nak', 'a', 'KIralykisasszony', 'PAUSE', 'mert', 'A', 'kar', 'mi', 'lyen', 'PISZ',... 
'kos', 'volt', 'a', 'RU', 'ha', 'ja', 'PAUSE', 'szep', 'volt', 'KU', 'lon', 'os', 'en', 'az', 'AR', 'ca', 'STOP', 'SZEP',... 
'en', 'MEG', 'fog', 'ta', 'a', 'KEZ', 'et', 'PAUSE', 'HA', 'za', 'vez', 'et', 'te', 'a', 'PA', 'lot', 'a', 'ja', 'ba', 'PAUSE',... 
'es', 'ket', 'HET', 'et', 'sem', 'vart', 'PAUSE', 'de', 'meg', 'EGY', 'et', 'sem', 'de', 'TAL', 'an', 'meg', 'egy', 'OR',... 
'at', 'sem', 'PAUSE', 'es', 'MEG', 'es', 'kud', 'tek', 'STOP', 'END'); 

P =416 G =200 


BONDS 6 

12   110        a, sot 
14   107        az, EM 
16   12         te, a 
18   12         mind, a 
25   26         nem, is 
50   61         hogy, SZER 
61   70         SZER, etsz 
68   11         EDes, LEany 
69   123        om, bol 
73   12         KERdezte, a 
76   45         et, sem 
107  108        EM-ber (man) 
107  108  109   EM-ber-ek (people) 
112  113        KIcsi, KIralykisasszony 
124  125        de, meg 


BONDS 6 (edited) 

5    6      Oreg, KIraly 
6    164    KIraly-fi (prince) 
8    165    es, MEG 
9    11     HArom, LEany 
9    31     HArom, ORszag 
11   12     LEany-a (his girl) 
11   69     LEany-om (my girl) 
12   38     a, HOGY 
12   9      a, HArom 
12   112    a, KIcsi 
12   6      a, KIraly 
12   113    a, KIralykisasszony 
12   53     a, LEG 
12   11     a, LEany 


We can see the development of an important generalization (highlighted yellow): 
the patterns of the definite article a and the subsequent noun group. The bond, therefore, 
can be completely described at the level of the interpreted CATS: 

[Definite Article]—[Numeral, Adjective, or Noun] 

or, to emphasize the higher level: Article—Noun 

There are also case and possessive suffixes of a noun (highlighted green) and 
some stable expressions: mind a (“all of the...,” “each of the...”) and nem is (“not so 
[bad]”). 


RIGHT CATS 6 (edited; selectively) 

CAT                            Domain                              Interpretation 

volt                           EGYforma, EGYszer, KU, PAUSE, a 
EGYszer                        MI, a, egy 
egy                            DA, ESZT, ORszag, OR, Oreg 
KIraly                         pause, stop, SZERet, a, fi 
es                             HArom, MEG, ket 
HArom                          EGYforma, LEany, ORszag, szep 
szep                           LEany, volt 
LEany                          a, om, a, at                        Case and possession markers of nouns 
az (“the”)                     AR, EM, ORszag, UD, Oreg            Noun starting with vowel 
SZERet                         ik, te                              Verb suffixes 
AD                             ja, ni                              Verb suffixes 
nem                            a, is                               Noun suffixes 
ORszag                         a, om, (sem), at 
sem                            (de), volt, vart                    Verb after negation 
MONDta                         EGYszer, a 
LEG                            (I), job, kis, szebb                Superlative 
SZER                           e, etsz                             Verb person 
FEL                            el, el j 
EDes                           APam, LEany                         Noun in addressing 
el                             te, telt 
MEG (multifunctional prefix)   es, fog, lat, tet                   Verb stem after prefix 


LEFT CATS 6 (edited; selectively) 

Domain                                   LEFT CAT            Interpretation 

MONDta, volt                             EGYszer             Verb (Past) 
az, egy                                  Oreg                Articles 
a, szed, Oreg                            KIraly 
a, mert, nincs, es                       HArom               Predecessors of noun group 
HArom, a, szep, EDes                     LEany               Noun group 
EGYszer, KIraly, KERdezte,               a                   Various words and endings 
LEany, MONDta, ORszag,                                       requiring definite article; 
ban, dult, ek, ik, ja,                                       nouns among them 
lamb, mind, mint, nak, nem, 
ra, szett, sirt, ta, te, volt, ugy 
HOGY, en, hogy, ki, meg                  az                  Same as previous 
KIraly, -ek                              SZERet              Noun 
SZERet, el, et, hat, szelsz              te                  Past tense 
lett, te                                 VOLna, “would”      Verb, Conditional 
BUZ, LEany, ORszag, OR                   at                  Object Case/Accusative 
hez, nak                                 AD                  Dative/Allative before a verb 
HArom, az, egy, szebb                    ORszag              Noun group 
HArom, volt                              EGYforma 
SZERet, ebb, mely, sebb                  ik                  A mixed-up CAT 
LEany, ORszag, var                       om 
JAR, TISZ, fog, lat, az                  ta                  Past Tense verb (predominantly) 
KIcsi, a                                 KIralykisasszony 



The first of the following two lines from LEFT CATS 6 contains HArom (“three”) as a 
CAT, but the second has HArom in the domain: 

a, mert, nincs, es        HArom    Predecessors of noun group 
HArom, a, szep, EDes      LEany    Noun group 


Since LEany (“girl”) is a noun, another high level pattern solidifies: 


Predecessor of noun group—Noun group—Noun—CASE/POSSESSION 


As far as the extracted GRAMMAR is concerned, it has no use until it comes to 
speech generation, which could be the subject of the next part. Obviously, to be applied 
to speech generation, each generator must be tagged by all CATS. This seems like an 
extraordinary requirement to natural memory (computers will take anything). But in 
chemistry, remarkably, it is not only natural but absolutely necessary for the chemists in 
order to talk to each other about chemical matters. 

Each chemical structure can be described as a list of all its “tags,” meaningful 
fragments, individual atoms, and their connections in such a way that the entire structure 
could be reconstructed from its linear description. The grammar of such linear 
descriptions of non-linear 3D-structures is called chemical nomenclature, and it is indeed 
a grammar of an artificial language used every day by chemists. More about it in any 
textbook of organic chemistry and in [4]. 

EXAMPLE. The chemical name of common aspirin is acetylsalicylic acid, which 
is a kind of old chemical slang, not quite grammatical. Nevertheless, it tags 
aspirin as containing a benzene ring, and tags it time and time again as an acid, as 
an ester of acetic acid and a phenol, and as something containing two adjacent 
appendages to the benzene ring. List only the tags (aspirin’s CATS) to a 
chemist and the reply will be “aspirin.” 

I do not believe that a simulation of robot-child on simplistic “chemical” 
principles, using regular consecutive computers, is a gratifying occupation, although it is 
possible. Nature is inherently parallel, but not in the sense of parallel computing as 
simultaneous execution of multiple tasks within a single problem. Neither individuals nor 
governments are good at that. I understand parallelism as translation of random or partially 
ordered collisions into communication. I regard the elementary acts of analysis and 
extraction as binary encounters of quasi-molecules: small linear sequences of sounds, 
syllables, morphemes, words, and blocks recognize each other and negotiate the outcome 
of the “collision:” deal or no deal. The outcome is recorded. 

Figure 1 illustrates the formation of two-level CATS by copy eliminations. 
Segments of different color correspond to letters in the Rules of input processing. One of 
two black segments is always eliminated. The other becomes a domain of a CAT. The 
circles do not need to be either correct or even closed. 



Figure 1: CATS formation in terms of linear segments 

As a preview of the next phase of exploration, I will give here only one, rather 
weak, illustration of a possible principle of speech generation. At this point it makes little 
sense because utterance starts with thought, but we do not have any settled unified 
approach to semantics. 

Suppose we have the following content with atomic ideas: 

{Kiraly Oreg szeret so} = {king old love salt} 

We do not know how the generators are connected in the thought. Regardless of 
the probabilities of the generators and bonds between them (which we may never know), 
let us retrieve all relevant lines from BONDS and CATS: 


LEFT                  Generator                               RIGHT 

2-a; 2-Oreg;          Kiraly, “king”                          PAUSE; STOP; SZERet; a; 
az; egy;              Oreg, “old”                             2-Kiraly; 
EMberek; Kiraly;      SZERet, “love”                          ik; (3rd person plural definite) 
                                                              te; (2nd person singular definite) 
2-a;                  sot, “salt” (object)                    2-PAUSE; 
                      Oreg Kiraly, “old king” 
                      a sot, “salt,” “definite object” 

Although we do not know what the thought is, we could develop a set of rules for 
semantics. For example, old refers only to king and nothing else, but king, salt, and love 
form a triangle: salt is the object of the king’s desire. 

Intuitively, I feel that the semantic relations can be represented by a kind of 
triangulation in a way similar to the way speech is represented by the squashed triangles 
of triplets, but I cannot substantiate this idea at this point. I hope to explore this central 
problem elsewhere. 

Figure 2 decorates the black semantic relations of the thought by red linear 
“comments” of the grammar taken from the above lines. 













Figure 2. Comments of grammar (red) to semantic relations (black) 


What follows from the comments (“catalysts,” as a chemist would say), is that the 
degree of conformity with the grammar is the highest when Oreg (old), Kiraly (king), 
szeret (love), and -ik (Present Tense) or -te (Past Tense) somehow line up in this order. 
But a sot (salt, Object Case) has no definite place and can dangle anywhere. And why 
not, if Hungarian has no fixed word order! Of course, a sot wedging in between Oreg 
and Kiraly would create a tension in the rather strong bond. Otherwise, any position is 
fine, but PAUSE makes the end position more probable in the context of Salt. 

The choice between -te and -ik is not decisive because of the lack of data. With 
a considerable stretch of rigor we can attribute the choice of -te to the absence of the 
end -k in Kiraly because the end -k is a practically universal marker of plural in both 
nouns and (except 1st person) verbs. 

EMberek; Kiraly;       SZERet      ik; (3rd person plural present definite) 
“people,” “king”       “love”      te; (2nd person singular past definite) 

But the real solution must come from a unified approach to grammar and semantics. 
With all the liberties taken, the final output is: 

Az Oreg Kiraly szerette a sot, 

which seems to me grammatical enough: “The old king loved salt.” 

Note the locality of the mechanism by which I came to the above output. There 
were never more than three generators in the focus of my attention. 

Neither the last example, however, nor all the preceding tables and examples 
prove anything but the need of something more convincing. They point to the role of 
multiple repetitions of basic speech patterns during the acquisition. What can be more 
convincing in our times than a solid computer simulation? The following sideshow 
discussion addresses this problem. 


DISCUSSION 

It is already clear, in the very beginning of testing the concept of robot-child, that the 
work is going to be very cumbersome. No wonder we have big brains. Formally, it may 
involve the following computational steps: 

1. Each syllable or word in the input is compared with the entire generator space 
G and labeled as either old or new. If it is new, it is filed into G . 

2. The memory content is re-analyzed in terms of BONDS and CATS (in 
advanced stages of acquisition it concerns only a very small part of memory). 

3. Generator space G, BONDS and CATS are updated, so that each generator is 
re-tagged by all current BONDS and CATS it belongs to. 


4. The entire memory is updated regarding the age of entries. 



5. The next input is rewritten (compacted from phonemes to morphemes) based 
on the entire stored grammar. This is the same as saying that the input is recognized. 

Even this partial description looks like a description of a social system. 
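
Part of this bookkeeping (steps 1, 3, and 4) can be sketched in a few lines of Python; this is not the MATLAB code of the project. The sketch rests on loudly stated assumptions: the class name Memory is hypothetical, bonds are approximated by counts of adjacent pairs (a stand-in for the Rules of input processing, not a restatement of them), and age is counted in input batches. Re-analysis into CATS (step 2) and compaction of the input (step 5) are left out.

```python
# Hypothetical sketch of steps 1, 3 and 4: filing generators, counting
# adjacent pairs as a stand-in for BONDS, and aging the stored entries.
from collections import Counter


class Memory:
    def __init__(self):
        self.generators = {}    # token -> generator number in G
        self.bonds = Counter()  # (left, right) -> number of co-occurrences
        self.age = {}           # token -> batches since it was last seen

    def feed(self, tokens):
        """Process one input batch and label each token as old or new."""
        # Step 4: every stored entry grows one batch older.
        for token in self.age:
            self.age[token] += 1
        labels = []
        for token in tokens:
            # Step 1: compare with the generator space G; file new tokens.
            if token in self.generators:
                labels.append("old")
            else:
                self.generators[token] = len(self.generators) + 1
                labels.append("new")
            self.age[token] = 0
        # Step 3 (simplified): update counts of adjacent pairs.
        for pair in zip(tokens, tokens[1:]):
            self.bonds[pair] += 1
        return labels


if __name__ == "__main__":
    memory = Memory()
    print(memory.feed(["a", "KIraly", "PAUSE"]))
    print(memory.feed(["a", "KIraly", "STOP"]))
    print(memory.bonds[("a", "KIraly")], memory.age["PAUSE"])
```

Even in this reduced form, every input touches the whole memory, which is why the text expects the full procedure to be cumbersome on a consecutive computer.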

Next, how are we going to use that knowledge? Suppose we want to express a 
thought, which is a configuration represented by the list of generators (content) and 
connections (connector graph) between them. Each entry in the content and connector 
(both contained in a single matrix with a non-trivial diagonal) has a numerical or 
quasi-numerical (in terms of a partially ordered set) measure. 

The thought can be a configuration of generators, sometimes incompatible, as in 
“Is that person over there in a dark overcoat a man or a woman?” Or: “Now you see it; 
now you don’t.” 

Each recognized generator retrieves its entire equilibrium cloud of associations 
( I am calling psychology for help) from BONDS and CATS. In other words, the configuration 
of the thought should be “decorated,” as in Figure 2, by multiple triplet quotations from 
memory. In the mind of a child the quotations are certainly not in the vocabulary of the 
English grammar. 

NOTE: My insistence on the size of linear neighborhood equal to a triplet is just 
for the sake of simplicity. In fact, the interactions between generators can be felt 
at a larger distance, which, by the way, is also a fact of chemistry. 

The crucial computation stage is to find the linear arrangement of generators 
that has the highest probability, i.e., the lowest overall energy, lowest stress, or lowest 
deviation from the grammar. By grammar I mean nothing but the state of the evolving 
mind of the robot-child, with all its CATS and BONDS (I almost said DOGS: an illustration to 
the concept of equilibrium). 
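
The search for such an arrangement can be illustrated by brute force on the example of the old king and the salt. The Python sketch below is an assumption-laden toy, not the MATLAB code of the project: the bond weights are invented for illustration (the text itself states that the real strengths are unknown), the names conformity and best_arrangement are hypothetical, and PAUSE is simply pinned to the end of the utterance.

```python
# Hypothetical sketch: choose the linear order with the highest total
# conformity to stored bonds. The weights are invented for illustration.
from itertools import permutations

BONDS = {
    ("az", "Oreg"): 1,
    ("Oreg", "Kiraly"): 2,   # the "rather strong bond" of the text
    ("Kiraly", "szeret"): 1,
    ("szeret", "te"): 1,
    ("a", "sot"): 2,
    ("sot", "PAUSE"): 1,
}


def conformity(sequence):
    """Sum the weights of all stored bonds realized by adjacent pairs."""
    return sum(BONDS.get(pair, 0) for pair in zip(sequence, sequence[1:]))


def best_arrangement(generators, end_marker="PAUSE"):
    """Try every order of the generators, with the end marker placed last."""
    candidates = (list(order) + [end_marker] for order in permutations(generators))
    return max(candidates, key=conformity)


if __name__ == "__main__":
    content = ["sot", "te", "a", "Kiraly", "szeret", "Oreg", "az"]
    print(" ".join(best_arrangement(content)))
```

Seven generators already require 5040 orders to be tested, and the number grows factorially, which is the computational burden the next paragraphs complain about.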

Omitting the subtleties, all this promises a large volume of boring programming 
and computation. As a champion of simplicity, I am the least suitable man for this hard 
labor of bug squashing, number crunching, and computer whipping. If molecules and our 
brains do it well enough on their own, let us better do it their way. 

It is not clear about the brains, but how do the molecules do it? 

There is an extremely hard (i.e., time-consuming) computational task known as 
the protein folding problem [5], in which, theoretically, each bead on a string must be 
tested in a certain way regarding its interaction with all the other beads. The solution is 
the configuration in 3-D with minimal overall energy. This task is so unpleasant (a 
so-called NP-hard problem) because the computation is consecutive while real folding is to a 
significant degree parallel (sounds almost like Malthus). In other words, folding is natural and 
fast and our computation is unnatural and takes enormous time. 

We are not as fast as molecules, so how can we speak and think at all? My answer 
is that this is possible because our thoughts are small, our attention span is short, our 
memory is a far cry from a hard drive, and our knowledge is limited. We can do it 
because we are imperfect. In fact, the protein folding problem becomes solvable if the 
proteins are short. But the language is so big! Right, but the protolanguage was not. 
Children do not speak as lawyers write, and even presidents can speak as children. 

Let us take a break, anyway. 

As a non-linguist, I can afford some unwarranted leaps of imagination. Thus, 
knowing nothing about the topic, I can derive the peculiar Hungarian possession marker 
from some ancient form with two articles (or, more probably, none at all), a kiraly a 
orszag, “the king the land.” The second article moves to the end: a kiraly orszaga. 

Further, I can derive the Hebrew possessive form from the same ancient pattern: 
sus ha-ish, horse the man, man’s horse. The first article drops off. 

I can also throw in the English variation on the theme: the king cobra and all the 
other noun modifiers, although it is not a possessive form. 

A more corpulent form remains in German: Das Pferd des Mannes, the horse 
of-the man-his, or the horse the man-his, the horse of the man. 

But Hungarian catches up and even overtakes in another, reversed-German style, 
formal possessive construct: 

Az embernek a lova, the man-him the horse-his, the man’s horse. 

This is, probably, too much, and modern Hungarian requires at most one 
article: Peter lova (Peter’s horse) or az ember lova (the man’s horse). 

There is no article in Russian, but the marker is in place and you can have it both 
ways: 

Лошадь Пети, horse Pete-his. 

Or: Петина лошадь, Pete-his horse, Pete’s horse (note the different word order). 

Furthermore, I can fantasize that the fixed stress in Hungarian compensates for 
the wide use of the scarce possession and tense marking morphemes. The trade-off is the 
free word order. 

In English, the marker morphemes are very scarce and the word order is not free. 
In Russian, with a wide variety of markers, both the word order and the stress are free. 

But the simple fact that Polish, very similar to Russian and with an equally 
rich choice of markers, has a fixed stress, wakes me up from my sweet dreams. 

This awakening gives me an opportunity to explain my position once more. I am 
definitely not a linguist. I present neither a theory nor a working model. It is just an 
abstract idea, not yet completely clear to myself, a concept that should be tested by 
professionals, although it originates outside linguistics from a higher abstract ground of 
Pattern Theory. As a chemist, however, I feel a certain hopelessness about the current 
algorithmic and numerical methods of computational linguistics. They take a lot of work 
but prove nothing but the intelligence and inventiveness of the authors. The computer 
models do not converge to a consensus unless you can test them as if they were weather 
forecasts or at least neural networks. 

I suspect that the failure or, let us hope, a long delay in developing automatic 
translation follows from the very idea of algorithm. The bootstrapping mind of an infant, 
unlike the mind of an adult, does not use anything like either statistical inference or 
algorithms, although the results could be the same. 

I acknowledge that I am not familiar with the theories of computability and 
complexity. But I am aware that the tacit prerequisite of computer science is that almost 
anything of practical importance can be computed within the current 
symbolic-consecutive paradigm, the success of which is tremendous and proven. But is the success 
absolute? And is it a success when it takes a lot of time and comes too late? It seems to 
me that the problems of automatic translation or robotic communication could be the true 
test for some new concepts of intelligence, all the more so because they are pretty closely 
related to the Turing test of intelligence based on verbal communication. This cannot be said 
about playing chess. To pass the strong Turing test, the computer must express itself 
without help. 

Trying to formulate my idea in the most succinct way, I present it like this: we 
could possibly create realistic models of language evolution and language acquisition if 
we were daring enough to change horses in the middle of the stream and switch to a new 
type of natural computers working on ordered chaos, homeostasis, and competition for 
energy. I presented some vague and possibly not new ideas about such pattern computers 
in [5]. Automatic translation with acquired grammar and lexicon could be a possible 
application and a stimulus for young linguists to acquire (what the heck) chemical and 
pattern-theoretical ideas. 

I can reformulate the concept in the form of an answer to the question “what is 
natural?” Natural is what has infancy. 

Being unable to calculate 2x2, such computers could be capable of computing 
the behavior and communication between children of pre-school age, who would be 
ready to start learning math and physics (not sure about chemistry) and foreign languages, 
to use algorithms, and to operate a PC. 

I am glad I will not live in such a world, but evolution does not ask for anybody’s 
consent. Probably we already have no choice. 





MINIMAL REFERENCES 

spirospero.net 

1. Yuri Tarnopolsky. 2005. Salt: The Incremental Chemistry of Language Acquisition. 
http://spirospero.net/Salt.pdf 
See also: http://spirospero.net/complexity.htm 
Tikki Tikki Tembo: The Chemistry of Protolanguage. 
http://spirospero.net/Nean.pdf 
Pattern Theory and “Poverty of Stimulus” Argument in Linguistics. 
http://spirospero.net/Poverty of stimulus.pdf 
The Three Little Pigs: Chemistry of Language Acquisition. 
http://spirospero.net/3LP.pdf 

2. Grenander, Ulf. 1995. Elements of Pattern Theory. Baltimore: Johns Hopkins University 
Press. 
-. 1993. General Pattern Theory. A Mathematical Study of Regular Structures. 
Oxford, New York: Oxford University Press. (Advanced work) 
-. Patterns of Thought. www.dam.brown.edu/ptg/REPORTS/mind.pdf 
(watch for updates; see also: www.dam.brown.edu/ptg/publications.shtml) 

3. Vicsi, Klara, et al. 2004. Hungarian Speech Databases, BABEL: Multi-Language 
Database, Project No. 1304. http://alpha.ttt.bme.hu/speech/hdbbabel.php 

4. Yuri Tarnopolsky. 2003. Molecules and Thoughts: Pattern Complexity and Evolution 
in Chemical Systems and the Mind. 
http://www.dam.brown.edu/ptg/REPORTS/MINDSCALE.pdf 
or: http://spirospero.net/MINDSCALE.pdf 

5. -. 2005. Molecular computation: a chemist’s view. 
http://users.ids.net/~yuri/PTutor.pdf 

EMAIL: http://spirospero.net/email.html Last updated March 22, 2009 


What if the words were atoms? 




















Yuri Tarnopolsky 


IDEOGRAM : A SIMPLETON IN A COMPLEX FAMILY 

2006 


"What giants?" said Sancho Panza. 

"Those thou seest there," answered his master, "with the 
long arms, and some have them nearly two leagues long." 

"Look, your worship," said Sancho; "what we see there are 
not giants but windmills, and what seem to be their arms are 
the sails that turned by the wind make the millstone go." 


"It is easy to see," replied Don Quixote, "that thou art not 
used to this business of adventures; those are giants; and if 
thou art afraid, away with thee out of this and betake thyself 
to prayer while I engage them in fierce and unequal 
combat." 




Cervantes, Don Quixote. (1605) 


In modern times the importance of the atomic theory is 
even more evident in political than in physical science. 


Hegel, Encyclopedia , §98 (1817) 






















PART 1 : GIANTS AND WINDMILLS 


My investigation into complexity and simplicity is gradually taking shape and purpose 
in my own eyes. This e-paper is an opportunity to share the still fuzzy vision. 

I am interested in evolving complex systems: life, mind, society, language, economy, 
science, technology, politics, culture, ideology, art, literature, and everything that human 
history comprises. In particular, I am interested in the limits of our knowledge about such 
systems. I am not interested in a theory. I am looking for understanding. 

I distinguish between knowledge and understanding. Knowledge can be generated, 
acquired, tested, communicated, modified, falsified, and confirmed independently by 
different explorers, who usually come to the same or similar conclusions under the same 
circumstances. Scientific knowledge stored in the natural (physical) sciences is reproducible 
and lasting. We know how nature behaves and we are expanding our knowledge. 

When we deal with very large physical systems, our knowledge can be incomplete and 
approximate, which is not such a big problem if we know the margins of incompleteness 
and approximation. In the humanities and social sciences, consensus is the exception rather 
than the rule. The inherent fallibility of the social sciences is the main tenet of George Soros’ 
view of the world, shaped by a long, hard, and truly experimental body of work. His book 
The Age of Fallibility: The Consequences of the War on Terror, Public Affairs, NY, 2006, 
contains illustrations valuable regardless of his personal position. 

Science itself evolves. It is a large evolving complex system. When we deal with large 
evolving complex systems (ECS, or X-systems), our knowledge of their behavior is 
incomplete because of their size and complexity. This is not a problem in itself 
because we usually ask questions on a limited and practical scale, except in philosophy, 
where truth matters as much or as little as in art. The main difference between X-systems 
and physical systems comes from the factor of evolution: the phenomenon of 
novelty is something the physical sciences do not observe in their objects. Novelty, 
however, matters a lot to a historian of science. 

Our knowledge of X-systems is most complete when it is least useful, i.e., when it is 
about the past and, to a lesser extent, the present. We may even know what is going to 
happen in the future, but we almost never know when. 

The science of complexity, as the study of X-systems is usually called, was initiated 
and advanced either by physicists and physical chemists, or by biologists and sociologists 
who were applying the apparatus of the physical sciences to life, society, mind, and even 
history and culture. Probably the most significant result of more than half a century of 
the science of complexity is a wealth of work in the absence of consensus. I would draw an 
analogy with perpetuum mobile. 

The physical sciences do not give us any exact knowledge of X-systems, except for 
about a page of general principles of thermodynamics. This small package, however, 
contains a true gem of knowledge: the so-called dissipative systems, some very simple 
and some as complex as X-systems, need a stable source of energy to sustain their size, 
complexity, and evolution. This energy (free energy, to be exact) is irreversibly 
dissipated into heat, with the difference retained as order. 

While most living species use the sun as the ultimate source of energy, modern human 
civilization, in addition to the sun, relies on limited or hard-to-get sources of mineral fuel. 
Their exhaustion will unavoidably bring big changes, but when and of what kind? This 
question remained purely academic until recently. 





Figure 1. Evolution of waterwheel (A-C) and windmill (D-F) 

A miller’s windmill (Figure 1 D-E) is an example of a device transforming the energy of 
the sun—through the intermediate energy of the wind—into mechanical energy for 
processing grain, another form of stored solar energy needed to sustain human existence. 
Today this antique contraption is being resurrected and reengineered for generating 
electricity needed to sustain the civilization of things. 


The watermill (Figure 1 A-C) was historically a much earlier device, but wind is 
capricious and omnipresent, while a creek is reliable but of limited availability. 
Ultimately the watermill, too, is driven by the sun, which creates the turnover of water. 


I use the remarkable story of the recent windmill awakening as a pretext to pose the 
following question: could antique forms of social organization, for example, the feudal 
system and monarchy, or even tribalism, still not quite extinct in the world, be 
resurrected and redesigned if mineral fuel were exhausted, or for some other reason? 


I ask that question—quite topical, in my opinion—to illustrate thinking in terms of 
patterns, which are much more resistant to erosion by time than particular social 
configurations. No wonder they are hard to kill: they are abstractions and nothing 
physical can touch them. 

I can add another question. 

The so-called war on terrorism is a war between two systems, one tribal terrorism and the 
other the powerful American Republic. Theoretically, the more powerful one should win. 
But why is it on the defensive (if only temporarily) rather than in the victorious sweep it has 
proved to be capable of? Thinking in such terms, just from considerations of 
symmetry, one may conclude that there are some very ancient and weak components in 
the American social system or its subsystems. Conversely, there are strong sides of tribal 
society, capable of launching suicide bombing on an unprecedented scale. 

Taking another angle, politics is always tribal and sometimes even suicidal, to which One 
Party Country by Tom Hamburger and Peter Wallsten (2006) and the midterm elections 
of 2006 provide scores of illustrations. 

The years of the Republican Revolution remind me of descriptions of carefully calculated 
multi-step chemical syntheses. The Republican chemists, who saw the inherited (initial) 
and the desired (final) situations, designed the step-by-step process like chemists from 
Pfizer or Merck, drawing a chain of small transitions from the starting reagents to the 
final profitable product seen in imagination. 

Furthermore, modern democratic political and social institutions, such as governments of 
big nations, are large evolving complex systems, too, with a lot of order maintained 
against a lot of chaos, and they need a constant supply of energy in the form of tax 
revenue and donations. 

In Figure 2 I equipped the two seats of power, the US Capitol and the Russian Kremlin, with 
devices for extracting the necessary energy from wind, an inexhaustible (at least for 
thousands of years) source of energy. The Eiffel Tower needs just a touch to become a 
windmill, but it is not a seat of power. 

An interesting although less timely question is how much the political system could 
change given a supply of energy independent of geopolitical factors. This question itself 
leads to the next: can we have some understanding of whether to look for 
scientific (i.e., based on approximate consensus) answers to such questions? 






Figure 2. A vision of tax-free governments 


I choose “windmill” to be an ideogram, i.e., in this case a label 
for a pattern of a device for extracting energy from a long-lasting 
source. I prefer it for its simplicity, but it can serve as a label, a 
brand trademark, for the main condition of any X-system: 
they feed on energy convertible into work (free energy), and they need a lot of it. 

I do not use the term pattern because it is already used in Pattern 
Theory (Ulf Grenander) in an exactly defined meaning, while I avoid exact definitions 
and statements. Counting on understanding rather than knowledge, I tend to resort to the 
subtle and indirect tools by which art commonly conveys its message to a little-prepared 
public. Thus, I believe that the series of photos in Figure 1 does not even need a caption 
to convey the message. The photos speak for themselves. 

In the language of Pattern Theory, an ideogram is a simple configuration chosen 
as a template of a pattern. 




I see ideograms as symbols for universal patterns of building blocks, stratagems, and 
tactical tricks by which X-systems keep evolving in size and complexity. 

Very approximately (turning to the original work of Ulf Grenander would be best), the 
potato shape is a pattern, while individual shapes are configurations. To describe the 
potato shape pattern, we choose a typical potato contour as a template and define the 
pattern as the template and all its deformations that preserve the potato shape. In the case of 
the potato and many other visual images, the deformations can be expressed mathematically 
in terms of group theory. In the case of an ideogram, we have to drag it through various 
X-systems, a procedure which I doubt can, or should, be expressed through equations. At the 
level of understanding, however, we can probably do well enough without rigor. 

Back to “windmill.” In a sense, all life, and the kingdom of 
plants in particular, is a system that extracts energy from the 
inexhaustible source of solar radiation. Plants are all 
“windmills” in the above sense. In Figure 1F we can even see 
how the wind turbine borrows its architecture from a tree. But 
there is a radical difference between the windmill and life: the windmill is a thing and life 
is a system. Energy and materials must be spent on making a windmill, but after that 
only minor maintenance is needed. The windmill is a solid thing with some degrees of 
freedom. Life is a steady state of flux: it needs a constant inflow of energy. 

If there is no wind, the windmill can quietly wait for a long time and then resume its only 
slightly less monotonous existence. If there is no light for a long time, the plants will die. 
There are various microorganisms that never see light but generate energy from chemical 
bonds of their substrate. We do the same, obtaining our energy from chemical bonds of 
the food and oxygen, which puts us under the same pattern as mushrooms (fungi, now 
regarded as a separate kingdom of life) and some plants that lost the ability to utilize light. 
This is a very intriguing and relatively underdeveloped topic, which I regretfully have to 
abandon. 







I prefer the windmill to photosynthesis as an ideogram because the latter is complex and 
invisible to the naked eye, while the unpretentious windmill can be seen in all its 
intimate simplicity. Diminished by the threatening complexity of the world, we could 
benefit from a large-type book with pictures to start growing up. 

We can see dramatic changes in art and culture over the last century, but what do they 
mean and where do they lead? We begin to see the repeating alternation of social and 
political patterns in modern history, but is there any kind of explanation why and when 
they change and what is their taxonomy? Those are examples of problems I am trying to 
understand without expectation of any scientific knowledge. 

By understanding I mean the ability to discuss such questions in a kind of lingua franca. 
We understand something when we can share our understanding with somebody in 
common language and terminology and get the proof of understanding. This process is 
nothing but the traditional method of teaching and learning by communicating a doctrine, 
receiving the feedback by asking test questions, giving another round of explanation, and 
so on until equilibrium. 

The multiple-choice exam is an imperfect but expedient method of getting the feedback, 
while live discussion in a narrow circle of scientists was the decisive way of creating 
modern physics and chemistry. Starting with molecular biology, however, science 
acquired postmodern competitive pragmatism and protective caution. Times have 
changed and every idea has got a price tag. 

The difference between knowledge and understanding is best illustrated by the fact 
that we can understand false, fantastic, and ridiculous doctrines, while knowledge, ideally, 
is either testable or has testable limits. Both knowledge and understanding can be 
imperfect. 

Understanding means a way to consensual knowledge and, if exact knowledge is 
impossible, to knowledge about its limits and predictive ability. Understanding is 
inherently dependent on language: it is a matter of common language. An ideogram can 
play the same role as Chinese characters, which ensure mutual understanding between speakers of 
mutually incomprehensible dialects. That was where I got the idea. 

Since Aristotle, the traditional way of science has been analysis of a complex object in 
terms of its simpler parts and features. The synthesis consists in an arrangement of 
components in a spatial and temporal order. 

A complementary way, which is neither analytical nor synthetic, goes from universal 
patterns to particular configurations. As a scientific method it is hopeless because in 
science patterns are derived from configurations and configurations from observed 
images, but it gives a holistic view of the world to be understood. Moreover, it makes 
possible the invention of new configurations (a new kind of potato) and even new patterns 
(genetically engineered fish-and-chips). 

Thinking about all that as a chemist, I have long seen a big difference between the ways 
the world is perceived by physicists and chemists. Physicists operate in the realm of 
continuous values, with the exception of quantum objects, while chemists deal with 
individual discrete structures. Thus, energy, temperature, and velocity can take any 
values, while graphs, i.e., combinations of points and lines regardless of distance and 
angle, do not form a continuous set and do not have any single natural way of ordering 
similar to the integers, time, or space. 

We enter a mixed atmosphere in the physics of elementary particles. Quantum 
physics, however, deals with statistical ensembles, while there are no statistical 
ensembles of Shakespeares, Hitlers, or the USAs of November 2006. There are no 
ensembles of French Revolutions and Collapses of Soviet Empires, either. All the wars, 
revolutions, and collapses do not form any ensemble because the numbers are small 
(hundreds) and the times are different. Yet there could be some way to understand 
revolutions and collapses, i.e., to develop a temporary consensus on how to talk about them. 
Numerous, but not too numerous, attempts have been made. 





The difference between the sciences today is more relative and diffuse than ever before. 
There is only one science, and the habits of segregation stand in the way of understanding. 
My intent at spirospero.net, therefore, is to explain and illustrate the chemical vision 
of the world as a contribution to universal understanding. 

Over a quarter century ago, one of my initial motivations to look at the world with 
chemical eyes was my inability to find a common language with an exceptionally 
intelligent theoretical physicist who could make any complex topic of physics 
understandable to me, but I could not explain why I saw big gaps in a physical 
approach to life and society. Physics was structure-blind and I was tongue-tied. 

The new, decisive, and radical step toward developing an abstract mathematical structural 
vision of the world—a chemical vision of the world outside chemistry—was made by Ulf 
Grenander in Pattern Theory, to which I have already referred many times. Comprising 
both continuous and discontinuous deformations of configurations, Pattern Theory 
bridges the physical and structural aspects of the world in the most abstract and universal 
way. 

I see my role as a freelance explorer of giants and monsters, prodding them with the 
purpose of proving that they are just windmills, which is what Don Quixote might have done 
after reading Don Quixote. 

I also have another, purely quixotic, idea. The complexity of modern science puts a high 
barrier on the way toward mass education. I believe that science could be taught in a new 
way, as a science of everything, with subsequent specialization along the way. Thus, my 
seven-year-old granddaughter studied the rainforest at school. In principle, this subject 
involves all sciences and even politics and is a good syncretic start for a synthetic science 
of everything. The study of science, however, never becomes synthetic along the way and 
splits into subjects that, all but the one chosen, can only be touched on the surface, with 
the substance too difficult and boring for most students, especially in our hedonistic 
culture of hurry and waste. The unity of the world is lost. 





PART 2. STAIRS AND RAMPS 


In Russia I became allergic to Marxism, having been force-fed it all my life. In America, 
finding a lot of reverence toward Karl Marx initially jarred me. Some components of 
Marxist indoctrination, however, became ingrained in my mentality, among them class 
structure and social justice, probably because they were not the monopoly of Marxism. 

While clearly seeing a great expanse of inequality in America, I was unsure about the 
class structure of the country. Such often-used terms as middle class and upper middle 
class seemed to float in the air, cloudlike, while the lower-class foundation and the 
upper-class roof of the social edifice appeared to be made of a different, heavier and denser 
material, constantly in public view but theoretically inconspicuous. The very first look 
into the American view of the problem told me that consensus regarding class structure 
was nonexistent. I saw that as an opportunity to apply, finally, my new mental gadget, 
ideograms, to the problem within the framework of the strictly individual project, a 
chemist's view of the world, to which the notion of consensus did not apply. The 
problem certainly looked to me like a giant, and I had to show that it was a windmill. 

In this section I give another illustration of what I understand as: 

Ideogram: a combinatorial element, pattern, building block, or an abstract feature 
of X-systems represented (labeled) by its simplest or most common embodiment. 

For an expanding collection of ideograms, see complexity and Essay 47. The War, 
where I use surface tension as an ideogram. Essays in simplicity contain many 
examples of distant parallels that can serve as seeds of ideograms. Thus, Essay 13. On 
Numbers applies the mathematical concept of a partially ordered set to Confucian ethics, 
Essay 14. On Taking Temperature with a Clock deals with measuring the temperature of 
music, and Essay 38. On Football compares the soccer World Cup and American criminal 
justice. 

The very purpose of the notion of ideogram is to use simpler objects as templates for 
more complex ones. 

This is another example of ideogram: 

Stairs: an ideogram of an evolutionary sequence of discrete structures of 
increasing complexity emerging on a continuously increasing supply of energy. 

Figure 3 compares two types of rising structures: continuous (ramp) and discrete (steps). 



Figure 3. Continuous (A-C) and stepwise (D-F) vertical mobility 


That class structure can be represented by any stepped structure is quite obvious. 
Figure 4 shows two out of many variations on the theme. 



Figure 4. Old cartoons of Pyramid of Capitalist System (A) and 
Russian Social Structure (B). The layers are labeled, bottom to top: (A) “we 
work for all” (left), “we feed all” (right); “we eat for you”; “we shoot at you”; “we 
fool you”; “we rule you”; “capitalism” [money bag]; (B) “We work for them while 
they...” “...shoot at us”; “...eat on our behalf”; “...pray on our behalf”; “...dispose 
of our money”. 


I remember a different Russian version from a textbook of history, which I could not find 
on the Web. The lowest and largest level was made up of peasants (“we feed you”), 
workers were the next (“we work for you”) and the czar and czarina were on top (“we 
reign over you”). 

Obviously, American society does not have the 
rigid social class structure in the sense of Figure 4. But 
what does it have that the economists and sociologists 
are so confused about? 


I choose stairs as a version of stepped structures to label the ideogram of class structure. 


Man-made things, like biological species and larger taxonomic units, do not have 
continuous intermediate forms. The reason for that is the discontinuous nature of 
structural complexity. There are no transitional forms between two different molecules. 
In short, structural complexity increases when the number and variety of building blocks 
increase. I symbolically denote a series of increasingly complex objects as the sequence: 

[Sequence of five symbols of increasing complexity, numbered 1 2 3 4 5] 


On complexity, see The New and the Different, where I suggested two kinds of 
evolving complexity: difference (a new combination of the same blocks) and novelty (a 
combination with a new block). In terms of Pattern Theory, a block is a generator. 

Complexity in X-systems increases stepwise. One might say that integers also form stairs 
of a kind. The difference is that the scale of integers is infinite, uniform, and is suitable 
for representing continuity, while the steps of the evolutionary scale, however numerous, 
are limited in number, are all different, and each can represent a very big leap. 

Simplification seems to be as much a property of evolution as complexification. I 
would take the evolution of electronic data storage devices, from magnetic tape to 
floppy disks, compact discs, and flash memory, as an example. I am not sure, 
however, that a thorough investigation would lead us to this conclusion. The area 
of structural complexity (understood differently in different areas) has only 
recently entered a stage of expansive growth. 

For an introduction to the problem, see Francis Heylighen, The Growth of 
Structural and Functional Complexity during Evolution. Search for evolution of 
complexity, structural complexity, and evolution of novelty. 









In Figure 5A I arrange five sources of electrical current in order of increasing power, 
symbolized by pictures of a small battery, a car battery, an ungrounded wall outlet, a 
grounded wall outlet, and an industrial switch. 

While the scale of increasing current power (resource) is continuous, the complexity of 
man-made devices is discrete: it has no intermediate forms. Thus, there is no common 
intermediate form between a battery and a rechargeable battery (accumulator), battery 
and a residential current source, etc. 



Figure 5. Complexity versus resource 

Figure 5B generalizes the relation between complexity and a continuous property, which 
should not be perceived as a law of nature or any function, either continuous, or defined 
over a continuous argument, or both. I can only hypothesize—referring to George Soros 
as a witness—that in the realm of evolving complex systems (X-systems) we should not 
expect anything exact and definitive. To all arguments of George Soros in The Age of 
Fallibility I add one, in my opinion, most important: novelty, which by definition cannot 
be fully (or not at all? an intriguing question!) anticipated. 


The answer to the question how we can develop any knowledge without numerical values 
is: we can very frequently, if not always, compare any two objects and determine which 
of them has more of some property than the other. 
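
The idea of building knowledge from comparisons alone can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: the five current sources come from Figure 5A, while the pairwise judgments in WEAKER_THAN and the helper names closure and rank are hypothetical. No object receives a numerical value; the ranking emerges from "less than" judgments and their transitive consequences.

```python
from functools import cmp_to_key

# Hypothetical pairwise judgments: (a, b) means "a has less power than b".
# No numerical value is assigned to any of the sources.
WEAKER_THAN = {
    ("small battery", "car battery"),
    ("car battery", "ungrounded outlet"),
    ("ungrounded outlet", "grounded outlet"),
    ("grounded outlet", "industrial switch"),
}


def closure(pairs):
    """Extend the judgments transitively: a < b and b < c imply a < c."""
    known = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(known):
            for c, d in list(known):
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known


def rank(items, pairs):
    """Order items using only the comparative judgments, never a number."""
    known = closure(pairs)

    def compare(x, y):
        if (x, y) in known:
            return -1
        if (y, x) in known:
            return 1
        return 0  # no judgment available for this pair

    return sorted(items, key=cmp_to_key(compare))


if __name__ == "__main__":
    shuffled = ["industrial switch", "small battery", "grounded outlet",
                "car battery", "ungrounded outlet"]
    print(rank(shuffled, WEAKER_THAN))
```

If some pairs cannot be compared at all, the judgments define only a partial order, and the sorted list should then not be read as a complete ranking.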



Figure 6. Resource, complexity, and system. 
Linear presentation 


I do not know how we can generalize structural complexity to fit a universal measure (I 
believe we cannot), but I know how we can generalize what I call “resource”: as energy 
or money. The latter is a universal systemic currency for social systems and their impact. 
Biological systems have their own money in the form of ATP (adenosine triphosphate). 


Since we are dealing with systems, by energy I mean the energy needed to create a 
system and to maintain it. This is true about subsystems, even if they are “dead.” 

Without maintenance even the Egyptian pyramids will disappear in some distant future. We 
deny maintenance to our computer (and scores of other things that serve us) by 
discarding it. We typically cannot deny maintenance to our body, dwelling, and place of 
work, however. Some systems, like the economy, release part of the consumed energy as 
production. 


Figure 7. The stairs of power. Axis labels: Complexity / Influence / Power; 
Resource / Energy / Cost (0, $, $$, $$$, $$$$ for steps 1 2 3 4 5). 

Figure 6 gives more examples of stepwise increase of complexity and available power: 
fist, gun, armed group, corporation, and White House, as well as five American 
non-profit organizations: Amnesty International, American Association of Retired Persons 
(AARP), The Brookings Institution, and American Heart Association, in order of their 
cost or budget (source: Capital Research Center). 

Figure 7 illustrates the relation between complex systems and their common resource. It 
also shows that the power of a system, i.e., its ability to function in a larger system, 
depends on its complexity and the latter depends on the original resource. Figures 6 and 7 
illustrate the great range of power. 

A subsystem of an X-system is, therefore, a transponder of a resource into usable and 
dissipated forms. Power is the ability to provide and allocate the resource. Thus, a 
corporation or an individual who donates money to a non-profit organization (or pays a 
hired killer) provides the resource and trusts the recipient with its allocation. 

What does it mean to allocate the resource? A chemist imagines the final result of an 
experiment, which is always possible in chemistry because of its closed generator space 
(i.e., the atomistic Periodic System), and moves toward it with a significant expectation 
of success, employing the enormous and ever-growing body of chemical knowledge. Failure 
often leads to new knowledge. Usually there are many possible ways to a goal, and if 
one ends up in a blind alley, another may bring more luck. 

A politician can see the final goal, but the goal can be very schematic and the knowledge 
illusory. A businessman is in a better position because of the available body of 
knowledge, although it does not guarantee success. But political knowledge in our 
impetuous time is more illusory than anything else. 

Nevertheless, many politicians are sober and realistic people. They understand 
that the more ambitious the goal, the more time it will take to reach it. Meanwhile, 
the politician can achieve a smaller, but more attainable and more selfish goal. 





I have neither qualification nor intent to go any deeper in this subject. All I want is to 
express the idea of the discrepancy between the continuity of resource and the stepwise 
evolution of man-made systems. 

Next I will try to encroach upon a territory completely alien to me: the actual class 
structure of modern capitalist society. 

In Figure 8 only the abstract complexity and resource are left. 

The chemical picture of the world requires me to modify the design of the social stairs by 
adding a transition barrier from one state to the next—a universal property of every 
change and one of the few main tenets of chemistry. Very approximately, I see it as the 
one-time additional quantity of resource (initial investment) needed to create a system of 
a higher complexity, which will, hopefully, sustain itself. 
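
A toy model may help to see how a continuous resource turns into a stepwise climb once every transition has a barrier. It is a sketch with invented numbers, not a model taken from economics or chemistry: the function name climb and the list BARRIERS are hypothetical, and each fee stands for the one-time investment needed to reach the next step.

```python
def climb(resource, barriers):
    """Return (steps reached, resource left) after paying each barrier in turn."""
    steps = 0
    for fee in barriers:
        if resource < fee:
            break  # the next barrier cannot be paid: the climb stops here
        resource -= fee
        steps += 1
    return steps, resource


# Hypothetical one-time fees for steps 1 to 4 (arbitrary units).
BARRIERS = [1, 2, 4, 8]

if __name__ == "__main__":
    # The resource varies continuously, but the step reached changes in jumps.
    for resource in [0.5, 1, 2.9, 3, 6.9, 7, 15]:
        steps, left = climb(resource, BARRIERS)
        print(f"resource {resource}: step {steps}, left over {left}")
```

In this sketch a resource just below a barrier buys nothing: the owner stays on the same step with the surplus unspent, which is the discrete counterpart of a local minimum.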

I wish to emphasize that all my judgment regarding economics 
and finances (as well as numerous other subjects) is strictly 
amateurish and generously augmented with creative ignorance. 


What follows from the above view of the problem is the illusory character of a classless 
(in the economic sense) society. One does not climb small numerous stairs, a dollar at a 
time, to the next floor. To put it differently, there is no social ramp to the top, although, as 
biographies of some future billionaires testify, a kind of ramp leads to the first step. To 
ascend further, one has to enter an elevator for a fee, Figure 8. 

The limited character of the resources available at any moment changes the simplistic 
relation between the complexity of the goal and the resource. The higher one rises, the 
higher the elevator fee, Figure 9. 





Figure 8, however, misses the most important property of the distribution of wealth and, 
probably, all specifically human values: the higher, the rarer. This class of 
distributions is known as power laws, and under various other names depending on the area of 
application. Wealth in capitalist societies is distributed according to the Pareto distribution. 
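
The statement "the higher, the rarer" can be made concrete with the standard formulas of the Pareto distribution. The sketch below is an illustration, not an empirical claim: the function names are hypothetical, and the shape parameter is an assumption (alpha = log 5 / log 4, about 1.16, is the value that reproduces the proverbial 80/20 rule).

```python
import math

# Assumed shape parameter: the value for which the top 20% hold 80% of the total.
ALPHA_80_20 = math.log(5) / math.log(4)


def tail_fraction(x, x_min=1.0, alpha=ALPHA_80_20):
    """Fraction of the population holding more than x under a Pareto law."""
    if x <= x_min:
        return 1.0
    return (x_min / x) ** alpha


def top_share(p, alpha=ALPHA_80_20):
    """Share of the total held by the richest fraction p (requires alpha > 1)."""
    return p ** (1.0 - 1.0 / alpha)


if __name__ == "__main__":
    for x in [1, 2, 10, 100, 1000]:
        print(f"holding more than {x} units: {tail_fraction(x):.5f} of the population")
    print(f"share of the top 20%: {top_share(0.2):.2f}")
```

Each tenfold rise in wealth cuts the fraction of people above it by the same factor, so there is no typical scale anywhere on the way up.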

For our purpose we can interpret the distribution as in Figure 9: each next step of the 
complexity ladder is higher than the previous one. This does not directly follow from the 
Pareto distribution but probably explains it. The common perception of life is that the 
beginning is difficult and the top is hard to reach. As for the distance between the first 
and the last steps, it seems like a ramp. Figure 9 metaphorically represents this illusion of 
the beginners, most of whom will end up in the middle, clinging to what they see as 
the slippery ramp. 



Figure 8. The chemical view of vertical mobility. The stable 
positions are local minima of cost. 




Figure 9. Increasing price of power and complexity 

A casual illustration of the cost of complexity is given in Figure 10, 
borrowed from: Eljas Orrman, Structural Complexity of Electronic 
Records as a Factor Guiding Decisions on Permanent Retention. The paper 
deals with the problem of costs and benefits in archival storage of electronic 
data records. The author does not specify what complexity of a record means and 
simply states that “The more material with a very complex structural form is taken 
in permanent/long-term preservation, the faster will the cumulative costs grow.” 


I believe that the complexity of modern society and its documentation is one of the major 
perils of democracy. The US Tax Code or congressional appropriation documents, 
unreadable because of their size, are the best examples of complexity for which mad or 
insane seem to be exact scientific terms and Byzantine too far off. This is one side of the 
problem. The other side is that there is no reason to expect an understanding of 
extremely complex problems from the elected officials, candidates, and voters with 
their ever-pressing election needs. Two years between American elections is too short a term 
to understand the problem and long enough to lose control over fast-paced events. 


I am going to add another example because of its intense eloquence. 


Source: Steve Burbeck. Complexity and the Evolution of Computing: 
Biological Principles for Managing Evolving Systems (2004). 

Complexity in the digital world is beyond our control.... 

Computing systems are seldom designed these days, they evolve and as they do 
so they become ever more Byzantine and complex. ... 

Civil engineers who create steel bridges have a saying that "rust never sleeps." 
The comparable maxim in computing ought to be that "complexity never sleeps." 
And, once complexity is out of control, it takes control. Computing professionals 
work tirelessly to reduce complexity but all too often their efforts actually 
exacerbate it because the already complex systems are far beyond our 
comprehension. ... 

IT professionals expend substantial resources detecting and cleaning virus and 
worm infections, patching machines, modifying firewalls, updating virus detection 
software, updating spam filters, and the like. ... 






CONCLUSION 

My investigation of class structure in America should not be taken too seriously. My
main intent was to add another couple of examples of ideogram as a tool of understanding
the world which is commensurable with our lives and observable directly, without either
telescope or microscope. The stairs are a simple thing, Dopey among the Seven Dwarfs.
Widely used as a metaphor, it labels, however, an important pattern of many not completely
understood X-systems and is chosen exactly because of its simplicity and observability.
So is the windmill.

I have omitted here many important points and references. More than half-seriously, I
believe that there is no use for a treatise on complexity that is more complex than
complexity itself.

The Marxist indoctrination did not make me a Marxist, but it imparted to me a lasting
interest in Hegel, who was regarded, following Lenin, as one of three sources of Marxism.

I conclude this essay with a wonderful ideogram Hegel used for the inherent 
discontinuity of devolution: knotted rope (“knotted line”). 

This process of measure, which appears alternately as a mere change in quantity, 
and then as a sudden revulsion of quantity into quality, may be envisaged under 
the figure of a nodal (knotted) line. Such lines we find in Nature under a variety 
of forms. (Hegel, Encyclopedia, §109)







Here it is: 





Created: December, 2006 


Last modified: December, 2006 



Do Piraha speak Nean? 


Yuri Tarnopolsky 


The war of words is a metaphor that should be taken literally in linguistics. 

In this essay I am omitting the history of the long war between the formal linguistics of
Noam Chomsky and his critics because the linguists know the story and the non-linguists
can turn to the well-written article in The New Yorker, The Interpreter, by John Colapinto.
The war remains as unfinished as the Iraq war, but the general state of things is more or
less clear: the Gullivers lose to the Lilliputians.

The article in The New Yorker is an excellent piece of journalism. It can be
complemented by many professional materials available on the web, googlable as: “Dan
Everett” Pirahã OR Piraha. I have no say in a professional linguistic discussion, but
my point of view may be of interest for the seekers of new ideas.

John Colapinto actually traveled to the Piraha tribe in Amazonia together with Dan
Everett, a linguist and an unusual personality, who is an unrivaled speaker of the
exotic language of the tribe.

John Colapinto writes (p. 130): 

When I asked Everett if the Piraha could say, in their language, “I saw the dog
that was down by the river get bitten by a snake,” he said, “No. They would
have to say, ‘I saw the dog. The dog was at the beach. A snake bit the dog.’”





I recognize in Everett’s transposition the language that I called Nean in my web 
publications: 

1. Pattern Theory and “Poverty of Stimulus” argument in linguistics (PoS) 

2. Tikki Tikki Tembo: The Chemistry of Protolanguage (Tikki) 
and others at complexity.

I am a chemist and not a linguist, although I am superficially familiar with the basic 
structure of several very different languages. My interest in languages is part of a larger 
program “a chemist’s view of the world” pursued on my website.

I see chemistry as one of the two most romantic sciences; linguistics is the other one. 
There are many parallels between them, which I explore at my site. Thus, chemistry 
uses an artificial language that compresses the rich three-dimensional world of 
chemical structures into a strictly linear sequence of symbols, exactly as our language
does when we describe an elephant or global warming. “The language of DNA” is
not a metaphor, but a standard term in molecular biology. Mark Baker, an eminent 
linguist, entitled his book “The Atoms of Language.” 

Here is just one example, which I never used before. One of the starting assumptions of 
the entire theory of Noam Chomsky about universal grammar was the ability of 
children to construct correct sentences never heard or corrected before (the so-called 
poverty of stimulus argument). An average chemist during his or her life brings to 
existence large numbers of chemical substances that never existed before, probably, 
even in the entire solar system. Each gets a unique name that serves as a photo ID: you 
can draw the molecule from the name and materialize it in the lab, if you wish. 

I am a father of several dozen never before known substances, but there is nothing
to be proud of: anybody can do that. Does it mean that I have some innate knowledge of
chemistry? Of course not. Chemistry had even denied the possibility of making some of my
molecular children before I actually did it.








I and the rather happy looking and attractive Piraha Indians, whose photos can be seen 
in The New Yorker and on the web—we all have an innate ability to understand the 
world and acquire skills and knowledge. If the adult Piraha cannot learn counting, it 
means that they either do not need it or the teacher cannot find the right key. If I am not 
mistaken, some of our very distinguished citizens could not master the Internet, either. 
Inability to learn the new lessons of history is even more common, and for the same 
reasons that Dan Everett attributes to the Piraha: “This is new stuff and they don’t do 
new stuff.” 

Most normal people in this hemisphere are not only incapable of learning chemistry but 
consider it the most incomprehensible and even repulsive area of knowledge. There are 
plenty of other occupations, for example, making and counting money. Some of our 
top national leaders occasionally fail even at their supposedly innate language skills,
instead relying on the somber language of power, a language much poorer, more color-blind,
and more primitive than even the language of the Piraha.

But what is Nean? In short, it is speaking in triplets or even in doublets. “I see dog. 
Dog was beach. Snake bite dog.” This is a good example of Nean grammar. It does 
not sound good, but all first contacts between different tribes and nations started with 
learning a pidgin version of the other language. The study of pidgins is a separate 
branch of linguistics. The fact of great importance is that we understand “I see dog” 
and “Dog was beach,” especially, in context. It is probably obvious to any Amazonian 
that snakes are common on beaches. He can explain to the crooked head: “Snake was 
beach” or simply “snake beach! snake beach!” “There are many snakes at the beach” 
sounds like “snake snake beach! snake beach snake snake!” 

Nean, in my view, is not exactly a language, but the primeval natural grammar. That
was my first thesis as an outsider when I got interested in the origin of language some
years ago.

I suggested in PoS and Tikki that Nean is a grammar of such simplicity that it could
originate spontaneously by self-organization. This follows from my perception of
complex systems: any natural complex evolving system, such as life, society, culture,
technology, etc., starts spontaneously as a system of minimal complexity. Then it
grows, evolves, beefs up its complexity, and diverges. Only very simple things in very
simple systems can happen without the input from a deity, human, or alien. A group of 
monkeys cannot type Hamlet, but they can certainly type by accident the name of 
Hamlet. Names like Bush and Gore are even more—and equally—probable to pop up in 
such experiments. No political allusion intended. 
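
The monkey argument can be put in numbers with a minimal sketch. It assumes, purely for illustration, that every keystroke is an independent, uniformly random choice among 26 letters, with case, spaces, and punctuation ignored; the function name chance_of_typing is hypothetical.

```python
from fractions import Fraction

ALPHABET_SIZE = 26  # assumption: 26 equally likely keys, case ignored


def chance_of_typing(word: str, alphabet_size: int = ALPHABET_SIZE) -> Fraction:
    """Probability that len(word) random keystrokes spell exactly `word`."""
    return Fraction(1, alphabet_size) ** len(word)


if __name__ == "__main__":
    for word in ("Bush", "Gore", "Hamlet"):
        print(word, chance_of_typing(word))
```

Under this assumption the two four-letter names are exactly equally probable, and each is 26 x 26 = 676 times more probable than the six-letter name of Hamlet. The text of the play, with its many thousands of characters, is out of reach for all practical purposes.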

NOTE: At a closer look, to type “Bush” may require less physical energy than 

to type “Gore.” “Gore” takes longer jumps from key to key. But this is arguable. 

What matters is the approach to language as to a natural process. What takes 

less energy is more probable. 

On the contrary, if a system is created by human mind from some existing material, it 
may not need simplicity. Bureaucratic systems are the best example. It has been noticed 
that our appliances are getting more complex and less reliable. The natural history of 
bureaucracy as pattern, however, also had to start with something very simple, such as 
just counting or policing. Moreover, counting also had to start with counting to two,
further proceeding to three, five, and ten or twelve.

In order to advance in mathematics further, you need a mathematician. The same with 
language: you need a speaker and a writer. But you need Demosthenes or Cicero only in 
a developed society. They are not expected to be good hunters in the jungle. 

My second thesis was that the system consisting of the speakers and the social 
environment maintains the state of homeostasis, which, probably, is the case with the 
Piraha tribe. There is no natural language separate from nature. Language is the device 
for maintaining the homeostasis of the tribe. If it evolves, it is because the homeostasis 
has been disturbed. If not, people can manage with what they have. 


My third thesis regarding language was that we have two very different kinds of
languages. One (Language 1 or hypolanguage) is the language spoken today in a much
more advanced form only by some uneducated and illiterate, but otherwise completely
normal, gentle, skilled, and caring people, albeit with limited experience, which we can 
find only in particular areas and social strata of the world, even in quite civilized 
countries. It is also the language of children before some age and level of schooling. It is 
also the language of new immigrants who are not as articulate as Noam Chomsky, but 
their children may become one. It consists of short segments of speech, often repetitive 
and overlapping. It is practically free of embedding and recursion. It is simple because it 
is difficult. For the same reason Hamlet is more difficult than “Hamlet” and “Hamlet” is 
more difficult than “Gore.” But the cardinal fact is that we understand Language 1 and 
people can successfully communicate in it, up to a point. 

The second language (Language 2 or hyperlanguage), that of long composite phrases 
with many subordinate clauses, is the result of advanced evolution. It is a bulky human 
artifice, like RMS Queen Mary 2. It was gradually created, by trial, error, and mutation,
in order to represent the growing complexity of civilization, including complex ideas and 
constantly shifting homeostasis. It is the language of science, literature, and bureaucracy. 
It has nothing to do with the origin and essence of the language that all humanity speaks, 
but everything to do with culture, society, economics, and maybe a desire to show off. 
On the negative side, it is also the truly barbaric language of the fine print and the US
Tax Code, a language so alien that we pay big money to professional translators.

With all the fuss around the Piraha, I feel a need to find some new non-professional 
arguments, just to convince myself. 

All Soviet college students in my time were mobilized to spend at least a month doing 
some agricultural work—a kind of forced labor. While doing time in the Russian 
countryside, I witnessed the aboriginal Russian and Ukrainian language, which used 
probably one percent of the grammatical resources (and sometimes 200% of the foul 
vocabulary for the male speakers). It was not a local dialect or slang, but the natural 
speech of some people of mostly older generation, sometimes almost illiterate, and born
before the introduction of the comprehensive system of education that the Communists 
had among their undeniable achievements. 





I of course do not have any examples. I remember, however, that in spite of its simplicity, 
the language was by no means handicapped. It was very expressive, for which there are 
incredibly rich lexical and morphological resources in both Russian and Ukrainian. It was 
in homeostasis with environment. It served its purpose and function. I understood the 
speakers and they understood me. 

Stimulated by the story of Piraha, I decided to look at the records of Russian folklore that 
could open an audio channel to the past.

There is a Russian site dedicated to Russian literature and folklore: http://feb-web.ru/ 


Here is an example of folk poetry collected in Russia in the middle of the eighteenth 
century by Kirsha Danilov (in my translation). 


"Гой еси вы, князи и бояра
И могучие богатыри!
Все вы в Киеве переженены,
Только я, Владимер-князь, холост хожу,
А и холост я хожу, неженат гуляю,
А кто мне-ка знает сопротивницу,
Сопротивницу знает, красну девицу:
Как бы та была девица станом статна,
Станом бы статна и умом свершна,
Ее белое лицо как бы белой снег,
И ягодицы как бы маков цвет,
А и черныя брови как соболи,
А и ясныя очи как бы у сокола".

Hey, you, princes and boyars,
And mighty heroes!
You are all married in Kiev,
Only I, prince Vladimir, remain single,
I remain single, walk unwedded,
And who knows a mate for me,
Knows a mate, a handsome girl:
Be that girl slender of figure,
Slender of figure, clever in the head,
Her white face like white snow,
The cheeks like the scarlet poppy,
Black brows like sables,
And the lucent eyes like the falcon’s.


The only possessive pronoun is highlighted. Otherwise, there are no obvious 
grammatical connections between the lines. There is no embedding. The short segments 
are brought together by the repetitions that look like the “pre-existing condition” of 
haplology, which I discuss in Tikki.


The lines look ready for haplology contraction: 

And who knows a mate for me, Knows a mate, a handsome girl: Be the girl
slender of figure, Slender of figure, clever in the head, Her white face like white 
snow. 






They can contract into the single phrase: 

And who knows a handsome girl with a slender figure, clever head, and a snow 
white face, who could be a mate for me? 
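
The “pre-existing condition” of haplology in these lines can be spotted mechanically. The sketch below is only an illustration, not a model of the contraction itself: it assumes that the condition shows up as the opening words of a line repeating a run of words from the line before it. The helper names words and repeated_opening are hypothetical.

```python
import string


def words(line: str) -> list[str]:
    """Lower-case words of a line with surrounding punctuation stripped."""
    stripped = (w.strip(string.punctuation).lower() for w in line.split())
    return [w for w in stripped if w]


def repeated_opening(prev_line: str, next_line: str) -> list[str]:
    """Longest opening run of next_line that also occurs in prev_line."""
    prev, nxt = words(prev_line), words(next_line)
    # Try the longest possible opening first, then shorter ones.
    for k in range(len(nxt), 0, -1):
        opening = nxt[:k]
        for i in range(len(prev) - k + 1):
            if prev[i:i + k] == opening:
                return opening
    return []


if __name__ == "__main__":
    poem = [
        "And who knows a mate for me,",
        "Knows a mate, a handsome girl:",
        "Be the girl slender of figure,",
        "Slender of figure, clever in the head,",
    ]
    for prev_line, next_line in zip(poem, poem[1:]):
        print(repeated_opening(prev_line, next_line))
```

On the four lines quoted above, the first and the third pairs yield the repeated runs “knows a mate” and “slender of figure,” while the second pair yields nothing. The repeated runs are the places where two short segments could fuse into one longer phrase.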

The folklore is not what we can call natural speech. It is a one-way communication and a 
product of long evolution, polished by centuries. It can be seen as an intermediate step
between Language 1 and Language 2. 


Here is another example:

"А и ласково сонцо, ты Владимер-князь!
Не нада мне твоя золота казна,
Не нада три ста жеребцов
И не нада могучия богатыри,
А и только пожалуй одново мне молодца,
Как бы молода Екима Ивановича,
Которой служит Алешки Поповичу".

You are indeed a gentle sun, prince Vladimir!
I don’t need your golden treasure,
I don’t need three hundred stallions,
I don’t need mighty heroes,
Only give me one lad,
Like the young Ekim Ivanovich
Who works for Alesha Popovich.


In the samples of old Russian poetry, the word которой (kotoroy) is the subject
pronoun who with a male gender ending. It can be found in the collection of Kirsha
Danilov, but it is rare: 18 per 8000 words (0.002). In a randomly selected segment of
Leo Tolstoy’s Childhood, the frequency of kotor- plus case and gender ending is 27 per
2512 words (0.01). Tolstoy, however, uses all other available subject pronouns and a high
level of recursion (still negligible as compared with the American Tax Code).
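
The two frequencies can be rechecked with a few lines of arithmetic. The counts are the ones quoted above; the function name relative_frequency is hypothetical.

```python
def relative_frequency(hits: int, total_words: int) -> float:
    """Occurrences of a word form per word of text."""
    return hits / total_words


# Counts quoted in the text above.
danilov = relative_frequency(18, 8000)  # kotoroy in Kirsha Danilov's collection
tolstoy = relative_frequency(27, 2512)  # kotor- forms in the segment of Childhood
ratio = tolstoy / danilov

if __name__ == "__main__":
    print(f"Danilov: {danilov:.4f}")
    print(f"Tolstoy: {tolstoy:.4f}")
    print(f"Tolstoy / Danilov: {ratio:.1f}")
```

With these counts the pronoun is a little under five times more frequent in the Tolstoy segment than in the folk collection. The samples are small, so the ratio is an indication and not a measurement.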

The two last lines of the above example could be put differently, in the tradition of
“pre-haplology”:

Как бы молода Екима Ивановича,          Like the young Ekim Ivanovich,
Екима Ивановича при Алешке Поповиче.    Ekim Ivanovich with/at Alesha Popovich.

Apparently, there was a person, lost in time, who applied haplology and invented the
pronoun out of something available for another purpose.




My point is that the Nean grammar, in all its natural primitiveness, opens a way to further 
gradual evolution toward modern recursive language, which we, however, try to avoid 
in everyday communication. The dense recursivity is intended for the written language, 
which does not impose a heavy burden on memory. The written text keeps a large chunk
of narrative fully in view. 

Regarding Piraha, the controversy comes not from its peculiarity but from the theory that 
was built initially on Language 2 but tried on Language 1 or 1.5. 

I believe that the painful Frankensteinian story of formal linguistics, re-tailored, re-cut,
and re-patched many times over (but so beneficial for scores of graduate and
postgraduate students of humanities all over the world), would have come to an end long ago
if Noam Chomsky had started with simple languages of pre-literate people instead of the
non-existing infinite productivity of grammar under the pens of sophisticated writers. It looks
like John Colapinto also noted that.

My prediction is that with more time the Piraha, the immigrants from our past, who 
suspect the outside traders of cheating, will find a way to control their trade, probably, by 
letting their children learn math and Portuguese. In the immigrant families it is the
children who serve as an interface between the old folks and new reality. 

Until then, it is just a tribe whose homeostasis has been miraculously preserved, as Dan 
Everett’s remark in his main paper testifies: “The Piraha are some of the brightest, 
pleasantest, most fun-loving people that I know.” 

If Russian is a language, so is Piraha. But probably not for the reasons of the formal 
linguistics. 

As for the formal linguistics, a similar story happened with chemistry: it was prevented
for a long time from the true understanding of nature by the dogmatic theory of
phlogiston, the hypothetical substance of flammability with zero or negative weight:
sulfur burns and disappears because it consists of phlogiston. During the entire eighteenth
century a large number of contradicting treatises were written in order to make some
sense of the theory, quite as with universal grammar, before Antoine Lavoisier
knocked it off like a dead cockroach. He then lost his head in the French Revolution, but
this is a different story.

Quite unexpectedly, the Piraha story has made me appreciate the historic significance of 
Noam Chomsky as a public figure on the world scale. He has been keeping linguistics in 
the focus of public curiosity and attention. As a result, the very linguistic trade today is
alive, growing, and hot. Naturally, mosquitoes from the young swarm have come to
bother the master and grow through the ranks on his blood. Similarly, as a stinging gadfly
himself, he has been driving the pachydermous American foreign policy into the focus of
world attention, although the animal was quite capable of staying there on its own. Whether
he is right or wrong is of secondary importance as compared with the importance of his 
dogged search for truth. Can a really intelligent person be completely wrong? 

John Colapinto got his story right. In politics follow the money, in private life cherchez 
la femme, and in linguistics listen to the horse’s mouth. 


SELECTED REFERENCES: 

Dan Everett’s main paper: http://www.stanford.edu/class/symbsys100/everett.pdf
or http://ldc.upenn.edu/myl/llog/EverettPiraha.pdf

The detailed paper of Andrew Nevins, David Pesetsky, and Cilene Rodrigues, which criticizes the 
conclusions of Dan Everett, can be accessed through http://ling.auf.net/lingBuzz/000411 (click on the title)
or directly downloaded through http://ling.auf.net/lingbuzz/@uESGsWZHRSsRjdIx 

A sample of Piraha text: www.llc.ilstu.edu/dlevere/docs/panther.pdf 

Peter Gordon’s article in Science is available online through EBSCOhost Research Databases: 

Numerical Cognition Without Words: Evidence from Amazonia. By: Gordon, Peter, Science, 00368075,
10/15/2004, Vol. 306, Issue 5695, pp. 496-499. Database: Academic Search Premier. Of course, you
can drive to a big library, if you have nothing better to do.


April 14, 2007 













Yuri Tarnopolsky

THE RUSTY BOLTS OF COMPLEXITY 

IDEOGRAMS FOR EVOLVING COMPLEX SYSTEMS 

ABSTRACT 

This essay continues a series of discussions (http://spirospero.net/complexity.htm) 
in which a chemist joins the debates about the large and complex world outside 
the narrow professional setting. It aims at understanding evolving complex 
systems (ECS- or X-systems: life, mind, society, culture, economy, language, 
ideology, etc.). Understanding requires transfer of information in a language 
shared with the receptor. The language must include terms for most abstract 
patterns of action and change regardless of particularities of the system. An 
attempt to create highly abstract ideograms for such patterns was made by Rene 
Thom. The problem is discussed here from the point of view of transition state in 
generalized chemistry. The focus of interest is the origin of the stubborn growth 
of complexity typical for X-systems. An aggregation through multiple weak 
bonds is a possible general pattern of complexification. Pattern complexity, the
alternative to algorithmic complexity, applies not to configurations but to the
configuration space that harbors them. Pattern complexity grows when the space
expands. Examples and toy models (ditalini pasta, “brushmobile,” rusty bolt, 
zipper, “wealth pump,” bow tie, and others) are used as illustrations. Pattern 
understanding of history is discussed. 

KEYWORDS: Pattern Theory, Ulf Grenander, Rene Thom, Hannah Arendt, Gregory Chaitin, 
Stephen Wolfram, Leonard Talmy, complexity, transition state, evolving complex systems, theory 
of history, cliology, understanding, X-systems, stability, lability, generalized chemistry, multiple 
weak bonds, origin of life, origin of society, aggregation, cognitive semantics,
ideogram, iconic graph, metaphor, bow-tie structure. 






“There are flows taking place within society and culture.” ... 

“What seem to flow from the wider culture to the scientific activity of an 
individual, and back, are more easily described as ideas, representations, 
or metaphors.”

David Aubin [9A, p.6 and 7] 


Introduction 

In this essay, as free-wheeling as all the others [1], I would like to offer a taste of a 
possible conceptual base for understanding evolving complex systems (X-systems: X is 
short for ECS) from the point of view of a chemist. I stop short of SECS, Spontaneously
Evolving Complex Systems, not out of prudishness, but because I distrust spontaneity. Examples are
biosphere, life, mind, society, economy, technology, science, culture, art, media, politics, 
language, ideology, i.e., everything from molecules to ideas, or, in terms of outer limits, 
from biosphere to noosphere. Some concepts mentioned here are discussed in [1] in 
more detail. 

Watching the development of the entire area of complexity and complex systems
for at least half a century, I see no sign of consensus and not even a hint that anything like
formal theory is possible. Taking into account the number of brilliant minds highly
successful in their professional areas and looking at a larger world outside the laboratory,
I attribute the lack of a unified theory of X-systems to the fact that by theory everybody
understands what it means in a particular professional domain. The best illustration is the
rift between sciences and humanities. How can we count on any consensus regarding, for
example, human history, if physicists and historians (who know incomparably more about
history than physicists) have no common language, not only with each other but also with
biologists, psychologists, artists, philosophers, etc.? Instead of computer games at one end
of the scale, Grand Theories of Everything at the other, and quantum intelligence somewhere
in the middle, we could do better by developing a common language for the modest purpose of
understanding what Everything is, as well as what is is.


This preliminary sketch, with an intent to expand on major topics in the future,
ties in a loose knot the following strands. Although I am a chemist, my major influence
for twenty-five years has been Pattern Theory of Ulf Grenander, which in my eyes is a
generalized chemistry. I have been familiar with the ideas of Rene Thom, Stephen
Wolfram, and Gregory Chaitin for as long. Most recently I have been impressed by
some other discoveries, among which Hannah Arendt had the strongest impact. Having
no formal theory of its own, chemistry borrows it from physics. Like computation, chemistry
takes an intermediate position between theorizing and making. Hannah Arendt’s homo
faber, man the maker, the modern hero, clearly resonates with chemical catalysis, the
chemical maker of life on earth. Those are the strands tied in a knot in this essay. The knot
is a metaphor and metaphor is one of the strands of the essay, too. The knot in the center of
this paragraph is an ideogram, which escapes a single definitive explanation. It can be
interpreted as: (1) marital knot, (2) movie plot in which two characters become
inseparable, (3) coincidence of two policies on some points and divergence on others, (4)
military quagmire, etc. The list of interpretations is always open and the prospects of the
knot being untied, cut, or fused into a permanent connection are uncertain.




PART ONE: CHATTING 

I am a chemist and the secrets of life are least of all mysterious to me. I am very much 
intrigued by mind and language, but most of all by understanding society and its future. 

My upfront goal is to illustrate how we can discuss cabbage and kings within the 
same vocabulary, so that they would look like twins. My secret goal is to infect the 
reader with the suspicion that chemistry is about human existence (human condition, as 
Hannah Arendt [2] would say) in the widest possible sense —oops! no more a secret. The 
world looks very peculiar through the eyes of a chemist, whose vision is as different from 
that of a physicist as the vision of a bee or a dog from ours: it is adapted to see through 
complexity. 

I am driven by a sheer intellectual curiosity because at my age I have little 
practical concern about the future. Yet I have also some personal reasons to be curious 
about the larger world. 

My interest in large complex systems comes from the time when I lived in the 
Soviet Union, trying to understand a social system that seemed completely unnatural and 
doomed. Although hardly anybody saw a quick end to it, the end came soon after I had 
left Russia. I went to America, swearing to purge my native land from my mind: the
vows of eternal hate are as futile as the vows of eternal love.

About twenty years later, Russia, the country crucial for the fate of Europe and 
the West, seems to be taking a new historical turn, in which I am no longer interested.
Judging by daily news, the turn would qualify as a regression to Russia’s tsarist past. 

I am much more concerned about my new American homeland, which offered me 
freedom, shelter, and comfort. 





I can see in the current presidency in America (2000-2008?) some shocking,
however distant, similarities with what I saw in Russia. I suddenly begin to sense the
historic fragility of the present, however glorious the past, facing the invisible
and incomprehensible future, which today is more global and unmanageable than ever before.

Regarding history, I am a fatalist, which means that I believe that the events have 
a preferred direction. The ripe apple always falls down. This may seem like a physicist’s credo. 
But our understanding of the world is a component of the events. If I believed in 
equations, I would say that our understanding and expectations count in the equation of 
the future. But I do not believe in equations for X-systems. 

In his intelligent book In the Country of the Blind Michael Flynn [3] imagined a 
practice of cliology, i.e. the applied science of history—with equations—which 
developed into a true technology, mostly of cloak-and-dagger type, of designing and 
shaping the future. Like every landmark idea, it is valuable for the questions it poses rather
than for the answers.

The technology of making history has been in fact evolving throughout history 
from the cloak and dagger to the suicide bombers and from the quill to the character 
assassination through TV ads. I do not share any illusions about the possibility of a theory of
history. The reason for my negativism was perfectly formulated by Hannah Arendt [2]:
the historically meaningful events are those that violate statistics. I would also add: and
equations. They always come as unexpected irregularities. Instead I have the following
five-point program to consider.

1. Since we cannot experiment with history—we can and we do, but the 
experiment cannot be reproduced—cliology is extremely unlikely. Impossible is a forbidden 
word anywhere except mathematics. Infinity is forbidden everywhere else. What we can do is to
understand history.

2. We never know if somebody understands something unless we have some
feedback. By understanding I mean the ability to convey a doctrine (an open system of
principles, explanations, and examples) to other people, for whom the ideas are new, in
the old terms, so that they will prove their own understanding, not necessarily perfect, by
passing it back to the source for the check, as well as further down the line, until it
diverges or evolves into another doctrine. Understanding, therefore, requires a common
language and involves a kind of a teacher-student relationship which converges to a
consensus. Like a relationship between two gossipers frequently exchanging “if you know what I mean.”
If understanding, according to Gregory Chaitin, means compression [4A], the proof of 
understanding means ability to compress and decompress in more than one way, like 
folding and unfolding a napkin in various ways. 

For comparison, here is Ulf Grenander’s note on understanding: 

We claim that we understand the pattern only if we can analyze it in terms of a given 
knowledge representation or, alternatively, if we can pinpoint in what respect it differs 
from what the representation prescribes. [5A, p. 100]. 


3. If we understand history up to its current moment, so that the nearest future
brings a delayed test, we will be able to pass the understanding to posterity facing a more
remote future than ourselves. It will be irrelevant by that time, however. In other words, 
our understanding of the world in the short run can be very different from that in the long 
run, but we never have the long enough run ahead. This contradiction is dramatically 
different from the post-Galilean belief in the timeless laws of nature on which all our 
activity in the role of homo faber is based. 

4. Suppose we can elaborate a set of principles that govern the preferable
direction of historical events in the short as well as in the long run. By regarding
“historical” as “natural,” we will come to a version of thermodynamics, which is a very
small set of universal principles about natural and man-made events. Unfortunately,
thermodynamics, timeless as any good science should be, is silent about the short run,
while history is all about the time span of at least one or two generations. We would at
best know what could possibly happen, but not when. We would know a set of
alternatives but not whether it was complete.

All we can do with thermodynamics is to conclude that an earthquake in San 
Francisco or even the end of the world is imminent. What we need is kinetics, i.e., the 
knowledge of the relative speed of alternative events and how to manipulate it. Thus, in 
the presidential election of 2004, in which the Republicans were much faster and nimbler than the 
Democrats (for which there were pervasive thermodynamic reasons), the outcome could, theoretically, 
have been different had the campaign lasted just a couple of months longer. 

For more about kinetics regarding abstract systems, see [1]. 

The outcome of the Battle of Britain during WWII was decided by the faster 
adaptability of Britain. The outcome of the competition for energy resources between the 
giant nations and unions (USA, EU, India, China) will depend on who adapts faster to the 
exhaustion of mineral fuel. The rate of population growth, which seems to be 
turning negative [3B], is of paramount importance for the political shape of 
the globe and a sustainable, however unexciting, economy. The chemical industry is 
possible because of the differences in the speed (chemists call it rate) of the desired 
and the unwanted reactions, while the respectable wine industry is based on equilibrium 
thermodynamics and is in no big hurry: the longer the better. 

The world is not flat because kinetic barriers warp, wrinkle, and crumple it. 

5. If history always brings something new and unexpected, could our 
understanding of history be lasting? The property of novelty is intrinsically alien to any 
closed theory, but our language is quite agile at expressing new things in old terms, 
after which new terms are assigned to new things and immediately cease to be new. 
What we can theoretically predict is not new. By definition, the new cannot be predicted, 
only the different. End of the program. 



Regarding the problem of realistic understanding of X-systems, I have two related 
hypotheses. 

My first hypothesis is that if we can foresee the physical fate of the earth for 
centuries ahead—alas, there is a problem, too—we can try to project reasonably far ahead 
our open-minded understanding of history and its distinctive technology. More generally, 
understanding lasts until the larger system that houses the system under 
consideration changes dramatically. I hope that there are some limits to meaningful 
abstraction. 

My second hypothesis is that from a high enough level of abstraction, our 
understanding of an X-system will be simple. In practical terms, the number of 
alternatives for the future will be very limited. At least we would not care about the next 
president. Alas, this makes computation, God’s manna for tenure seekers, useless, but this is not my 
problem. 


The two hypotheses can be combined in a statement that may seem paradoxical. 
In order to make predictions more accurate, we need as little detail as possible. In 
other words, we should step back and look at the Earth as a whole from a distance: from 
what Hannah Arendt [2] called the Archimedean point. We will not see much, but we will 
see what is large and lasting. In practical terms, we need to develop an abstract language 
to talk about Everything over the Grand Canyon dividing sciences and humanities. 
Neither formal mathematical systems nor physics can bridge it, to which a century of 
debate with no consensus testifies, because formal systems are not open to novelty. 
They are designed to be timeless and this is why they are disposable and can be out of 
favor if a new formal system (paradigm shift) looks more realistic. 

Chemistry, on the contrary, is an open system because it is not formal. Chemistry, 
like art, is a collection of old novelties, but it expands by the minute. Novelty is a 
recognized, however ephemeral, property of chemical compounds; see [1A, Chapter 1]. 

Of course, chemistry, as everything else, looks simple only from the Archimedean point. 



Paradoxically, biology has a theory based on chemistry, but chemistry does not 
have a theory in the common sense, except for physical theory. It possesses a very 
compact, but extremely successful, set of principles involving thermodynamics and 
kinetics, as well as efficient methods of handling the ferocious structural complexity. Of 
course, reducing it to simplicity, what else? Chemistry, however, has been silent on the subject 
of Everything and for a good reason: the proof of the chemical pudding is always in the 
making, not talking. Fortunately, Pattern Theory [5] (Ulf Grenander) is available as a 
mathematical generalization of chemistry applicable to Everything, from molecules to 
thoughts. 

We can now glue generators together and the bonds will tell us what combinations will 
hold together. This is a bit like chemistry: atoms (generators) are connected together into 
molecules (configurations) and the nature of the chemical bonds, ionic, covalent, and so 
on, decides what combinations of atoms will be stable enough to form molecules. [5A, 
p. 83] 

This is why Pattern Theory gives a renegade chemist like myself a podium for 
an oration, but words are often too heavy for a chemist to convey understanding. A 
chemist usually expresses his or her ideas in the graphic form—as ideograms—and this is 
what I am looking for as a medium for communication. Like music and visual arts, 
images appeal to universal human nature. This is why the advent of TV was a real revolution in 
propaganda: the visual perception outruns thinking. The velvet coercion of TV is more powerful than 
prisons and concentration camps. 

My intent, driven by my belief in the unity of the world, is to contribute to the 
development of a distinct language to talk about X-systems in the most general, but 
informal way, equally accessible to physicists, poets, chemists, and historians, as long as 
they are interested in tall problems that cast no long monetary shadows. Another metaphor. 

Concerning shadows, I believe that Ulf Grenander, as well as, on a smaller scale, 
Rene Thom, Stephen Wolfram, and Gregory Chaitin (himself brilliantly referring to 
Leibniz) cast the longest intellectual shadows onto the slowly developing understanding 
of X-systems as open systems. 



I use the term open not in the classical physical sense of exchange of matter, 
energy, and information with the environment—they are open in that sense, too—but in 
the sense of inherent incompleteness due to the phenomenon of novelty. X-systems are 
open to all of the above but also to a future that cannot be derived from the present—as 
Stephen Wolfram’s deep conviction and models testify [8]. If so, can we speak about 
theory? Not in the contemporary sense (here comes the future, which is always before the 
date) and this is why it is a different kind of science. But we can still understand such 
systems. I see the “theory” of organic chemistry as a typical understanding tool, as open 
to the future as a Lego set. 

Whatever the new understanding might be, the grandiloquence of some pioneers 
should be forgiven until the course of history vindicates them and the steamroller of habit 
levels the highway asphalt over the former trail. 

I hope to expand elsewhere on how computers reminded timeless mathematics 
of the ticking clocks. Until then, here is a quotation (secondhand) from Rene Thom 
[7B]: 


We cannot consider catastrophe theory as a scientific theory in the usual sense of the 
term. . . . [W]e must consider it as a language, a method, which allows to classify, to 
systematize empirical data, and which provides these phenomena with the beginning of 
an explanation that makes them intelligible. 

Earlier I suggested [1A] the terms Aristotelian (A) and Heraclitean (H) for the 
two kinds of systems and, perhaps, sciences. A-systems are the subject of science as we 
know it and H-systems are waiting if not for a new science then for some liberalization 
of the existing science. 

To understand means to tell somebody who can tell it to somebody else. 
Obviously, what we must have for this is a common language. An example of such a 
common language in the larger world is the universal language of nucleotides, by which 
future generations of organisms understand their predecessors and send a message 
to posterity. 

The previous paragraph is itself an example of the language of metaphors, 
analogies, and similarities, shunned by most so-called serious scientists as metaphysics but 
legitimate in the pursuit of understanding. Hannah Arendt’s The Human Condition was 
written in this language on the other side of the great divide between sciences and 
humanities, but the divide is also open to the future, a strange-sounding phrase meaning that 
it could be closed. How? By an earthquake? Hannah Arendt was listening to the voices across 
the divide, among which the voice of Werner Heisenberg [6] was the clearest, operatically 
schooled in the widest acoustic range of Goethean traditions. 

The new ideas are not going to undermine “real” science in any way. But they are 
still met mostly with indifference and rarely with hostility (as in the case of Stephen Wolfram; but many 
negative reviews still sound more like praise) because they are different. Rene Thom 
acknowledged a formal failure of his work. He came to a conclusion that the theory, met 
initially with enormous enthusiasm, had failed because nothing could be calculated from 
it. 

For as soon as it became clear that the theory did not permit quantitative prediction, all 
good minds ... decided it was of no value. [7C] 

In other words, it did not pay in terms of grants and tenure. It turns out that Thom 
regarded his theory as a “new kind of science:” 

This would not be the old science, but a new one, which would endeavor to provide 
explanations rather than mere description or predictions. [7D] 

The problem of hostility toward “a new kind of science” echoes, in a fascinating 
manner, another modern phenomenon: religious zeal against Darwinism. This is an 
example of a social pattern, which we should keep in mind. Whether this is an X-system 
pattern is an intriguing question. Let us keep it in mind until we discuss the “rusty bolt” 
ideogram. 

Speaking of money, there is another beautiful (and exact) analogy, promoted by 
biochemists, between money in society and ATP in living cells as the universal currency of 
energy (search the Web for “money ATP adenosine”). This is definitely an 
X-system ideogram. 



Another example of what I call ideogram is the bow-tie structure [11]. It is not 
actually a structure in the common sense but a very abstract pattern of organization 
consisting of a large number of inputs being converted into a small number of 
intermediates further converted into a large number of outputs. In other words, 
“complexity -> simplicity -> complexity.” Two examples follow. 



[Figure: two bow-tie examples. Nutrients -> Metabolism -> Biomolecules; 
Genes -> Expression -> Proteins. Sources [11A, B].] 

[Figure: Organization of the Web. The sites in the core are all 
interconnected with links. Sources [11C, D].] 


From the pattern and chemical viewpoint, the bow-tie ideogram means a 
decomposition of configurations into generators and a recoding and/or recombination of 
them into a new array of configurations. The combinatorial configuration space is always 
larger than the generator space. Further examples are behavior, government and 
management, language and information processing, etc. There is something hidden in the 
bow-tie ideogram, however: a complex system of rules or algorithms governing the 
processing. It is an open question whether the set of rules is also simple. I 
believe it is, and I tried to illustrate it with the example of language. The genetic code is 
certainly simple. Another intriguing question is the optimal complexity of government. 
Excesses toward both simplicity and complexity of government can be dangerous, as 
American political history keeps illustrating with fresh examples. 
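
The claim that the combinatorial configuration space is always larger than the generator space can be illustrated by plain counting. What follows is only a minimal sketch in Python, assuming a strong simplification that is not Pattern Theory proper: configurations are taken to be linear strings over a small generator alphabet, with no bond rules restricting them. The four-letter alphabet and the function names are hypothetical illustrations.

```python
from itertools import product

# Hypothetical generator space: four letters standing in for nucleotides.
GENERATORS = ("A", "C", "G", "U")


def configurations(generators, length):
    """Enumerate every linear configuration (string) of the given length."""
    return ["".join(combo) for combo in product(generators, repeat=length)]


def space_size(n_generators, max_length):
    """Count all unrestricted linear configurations of length 1..max_length."""
    return sum(n_generators ** k for k in range(1, max_length + 1))


triplets = configurations(GENERATORS, 3)
print(len(GENERATORS), "generators give", len(triplets), "triplets")
print("configurations up to length 10:", space_size(len(GENERATORS), 10))
```

Even this unrestricted toy shows the narrow waist of the bow tie: a handful of generators in the middle, and a configuration count on either side that grows exponentially with length. Real bond rules would prune the count but not change that asymmetry.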

ATP, rust, and even money are in the legitimate sphere of interest of chemists, 
who, however, are conspicuously absent from most discussions on X-systems beyond 
chemistry. There is very little on chemical complexity (for example: [9C]) and no 
consensus on measures of structural complexity, although algorithmic complexity seems 
to have passed the test of time. The words chemistry and catalyst, however, belong to the 
common vocabulary from the street corner to political debate because they embody 
powerful analogies. On catalysis, see [1A-E, K, L]. 

Chemistry has a unique attitude toward time. It is concerned with the intimate 
and delicate mechanism of change and is not satisfied with the question: “what is going 
to happen?” because there are typically multiple outcomes. Anything we can imagine in 
the combinatorial chemical space can happen. The main question of chemistry is: “what 
is going to happen faster?” 
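
The question “what is going to happen faster?” has a simple quantitative form in the textbook case of parallel, irreversible first-order reactions that start from the same material: the share of each product equals its rate constant divided by the sum of all rate constants. The sketch below assumes that idealized case only; the function names and the rate constants are hypothetical illustrations, not data from this essay.

```python
import math


def product_shares(rate_constants):
    """Share of each product for parallel irreversible first-order reactions.

    rate_constants maps a product name to its rate constant (same units).
    """
    total = sum(rate_constants.values())
    return {name: k / total for name, k in rate_constants.items()}


def converted_fraction(rate_constants, t):
    """Fraction of the starting material consumed by time t."""
    return 1.0 - math.exp(-sum(rate_constants.values()) * t)


# Hypothetical rate constants in arbitrary units.
rates = {"desired": 5.0, "unwanted": 0.5}
shares = product_shares(rates)
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
print(f"converted after t=1: {converted_fraction(rates, 1.0):.3f}")
```

In this irreversible toy the faster channel simply wins in proportion to its speed. Letting the products interconvert would bring the long-run thermodynamic outcome back into play, which is the distinction between the short and the long run drawn in the text.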

Chemistry shares its approach to time with other sciences about X-systems, for 
example, with history (the French Annales School): it emphasizes the distinction between 
the long run (longue durée) and the short run. There is an intriguing anti-symmetry relation 
between chemistry and history, however. Chemistry states that what we know about the 
long run—and we know a lot—does not tell us about what is going to happen in the short 
run, which is a matter of educated guess, best of all substituted by a simple experiment. 
History states that what we know about the short run—the period of the Clinton 
prosperity and the technology bubble, for example—could not tell us a bit about the next 
(current) radical turn in American history, which cannot be analyzed until long 
afterwards. The only safe prediction was that both would come to an end. 

The next (third) turn could bring even more division, shame, and death, as well 
as profits, or it could heal some wounds and repair the battered roofing and mud-flooded 
foundation of the American system. Can we see it ahead of time? It may seem that 
politics, unlike the vagaries of the market, is a game within a limited circle of players 
and statistics does not apply to it. But if the masses of voters are involved, shouldn’t 
there be some thermodynamics in it? On the other hand, with the growing power of hedge 
funds and big mutual funds, the market might cease to be anything but a televised poker 
party. Since the masses follow the leaders, what politics can be the fastest? Such 
dilemmas constitute a typical problem of understanding history in the making. If we 
understand history, we probably will be able to look a little beyond the time horizon. Note 
that this is a typical example of the need to express the new in terms of the old, as 
understanding should be. 

Our views of society are highly divergent and polarized. Experiments with society 
are mostly unintended, sloppy, or rigged. While technology in experimental science 
brings us into close contact with factual reality, technology in politics and business puts a 
TV screen between us and the factual truth. The modern presidential debate is a classic case of a 
rigged experiment. Can we have any degree of consensus in understanding “what we are 
doing,” the central question of Hannah Arendt’s The Human Condition? 

If the primary question about an X-system is how stable it is, history is scientific 
enough to tell us that no X-system is stable forever. If the question is about the next stage, 
history is not scientific enough to tell us that, either. If the question is how soon the 
current state is going to become unstable and collapse, we are able to evaluate that, but 
our evaluation immediately raises the next question: how soon the current state is going to 
become unstable and collapse after we have evaluated its stability and found it low. 

Chemistry always asks: “how soon?” What happens faster is the most likely 
future. This is similar to the process of thinking, when out of a combinatorial immensity 
of all possible thoughts only one enters the consciousness, see [5C]. The abstractness of 
this similarity is exactly what makes it relevant for understanding X-systems. It illustrates 
the main problem with the rusty bolts (we will come to them later) of understanding “new 
kinds of science:” In order to accept a new view of X-systems we have to abandon the 
old paradigm that provides us with daily sustenance. 



At a closer look, the problem is imaginary. The ambiguity of the term “new” is 
responsible for some rusty resistance to a “new science.” A new refrigerator means that 
the old one is discarded. But it also could be a second refrigerator for food storage in the 
basement. I am inclined to leave only three necessary attributes to the notion of science: 
observation, understanding, and consensus. Thus, I do not expect the idea that the 
universe is a computer to come close to either consensus or observation. 

I believe that the science of X-systems, when it takes its shape, will be not the 
new and only science but a second—complementary—science. It will be based not on 
closed axiomatic timeless mathematical or logical systems, but on open combinatorial 
systems of simple local interactions with laws and rules of limited rigor and validity and 
with yet unknown but definitely limited predictive power. 

Not surprisingly, the enormous factual material about X-systems has been 
accumulated mostly in the humanities. Science I (physics) and “Science” II (understanding) 
are the overlapping components of the human picture of the world, to which the humanities 
contribute their intuition and flair. 

Today complex systems are studied by Science I in terms of dynamical systems, 
an area as logically closed as any physical theory. The current interest in the less 
hermetic data-driven dynamical systems testifies to the drive toward realism, but such 
systems do not break out of the enclosure of formalism. Nevertheless, modern science of 
complexity is more conscious of the long and short run, as well as of the large and small 
cognitive grain size. 

In the later three above-mentioned application fields [meteorology, chemistry, and 
physics] the data is typically sampled from a complex biological, chemical or physical 
dynamical system, in which there is an inherent notion of time. Many of these systems 
involve multiple time and length scales, and in many interesting cases, there is a 
separation of time scales, that is, there are only a few "slow" time scales at which the 
system performs conformational changes from one meta-stable state to another, with 
many additional fast time scales at which the system performs local fluctuations around 
these meta-stable states. 



From: Boaz Nadler, Stephane Lafon, Ronald R. Coifman, Ioannis G. Kevrekidis, 
Diffusion Maps, Spectral Clustering and the Reaction Coordinates of Dynamical 
Systems (2004). [Meta-stable state needs a light push to descend to a stable state]. 
http://www.math.yale.edu/~sl349/publications/reaction coordinate.pdf 

It seems to me that Science I is moving toward complex theoretical solutions of 
complex problems (complexity squared), instead of understanding “what we are doing” 
while creating the problems. Complex problems may have simple reasons, but not simple 
solutions. On simple reasons, see Essay 28, Simple Reasons, in [1M]. 

It is worth noticing that computation—the greatest most recent novelty of 
science—runs in intrinsic and measurable time and size (equivalent of space), unlike the 
classical mathematics, but very much like a chemical reaction. A unit of time is an 
elementary act of computation: a single beat of the processor clock. 

The computer is a great embodiment of Leibniz’ idea that time is the sequence of 
events. When the computer is paused, its time stops dead. The unnatural property of 
the computer is that it can be revived from its clinical death by the magic wand of a slight 
touch. 


The passion of Gregory Chaitin for computation and the simple local rules of 
Stephen Wolfram, regardless of mathematics and philosophy, resonate sympathetically in 
my chemical heart. I was brought up in the openness of chemistry that weaves 
complexity out of 100 kinds of blocks of atomic Lego in the same manner as art weaves 
its complexity out of a couple dozen letters of the alphabet and maybe a couple hundred 
color spots. 

The recent Meta Math! by Gregory Chaitin [4A], in my chemical interpretation, 
carries a clear and universal message of mistrust in all closed logical systems, on which, 
as I see it, physics, but not chemistry, is grounded. As a chemist I see what looks to me like 
an underlying deep reason why perfection is imperfect: the world is in change, while 
abstract Aristotelian reasoning, like equilibrium thermodynamics, is heavenly beautiful 
but never lasting. Physics keeps pace by breaking old paradigms (like a glass at a Jewish 
wedding) and espousing new ones. We all do that at new turns of history and personal 
experience. Regarding history, however, we do not even have a consensus-sanctified 
paradigm to smash. 

Alas, what mathematics still lacks is the measure of energy (or stability, probably, 
related to mathematical elegance) of a mathematical expression, but human languages 
clearly rank by energy required to use them. English and spoken Chinese are easy to 
handle, German is heavier, Russian is very heavy, and Navajo is the hardest of all 
languages I am aware of. 

In [1J] I casually suggested that Navajo verbs were in essence ideograms, like in 
Chinese writing, only not graphic, but acoustic ones. The acoustic analog of a pictogram 
would be a sonogram (not in the usual sense; phonogram is also taken): an imitation of a 
natural sound, which is regarded as a source for language origin, according to some 
hypotheses. I suggest a further (tentative) generalization of ideogram: a symbol for 
representing an abstract idea as an entity of a very coarse grain, leading to 
simplicity. 

I have not escaped a metaphor in the previous sentence. Metaphor is the closest 
notion to ideogram. The difference is that metaphor names an object with a name of a 
different object, while ideogram names an abstract property of many objects with a single 
name and does it with a non-verbal symbol of its own. This, however, smells of a 
definition. I prefer to rely on demonstration, as, probably, Neanderthals did. 

Concerning names, I see in Gregory Chaitin’s description of understanding as 
compressing [4A] an echo of Henri Poincare’s casual remark that mathematics calls 
many things with one name. Ideogram is, therefore, akin to a mathematical symbol. 

Complexity is a very tricky subject. The problem I have with complexity of finite 
sequences of zeros and ones, as well as with the finite output of a cellular automaton, is 
that each of them is a closed object, if we leave infinity to mathematicians and 
theologians to wrestle with. I am uncomfortable with ascribing randomness (and even 
regularity, which is a much better term) to single objects and using it for a complexity 
measure. My discomfort only deepens when I read in Chaitin’s lecture [AIT = 
Algorithmic Information Theory]: 

In these lectures I discuss philosophical applications of AIT, not practical applications. 
Indeed, I believe AIT has no practical applications. 

The most interesting thing about AIT is that you can almost never determine the 
complexity of anything. This makes the theory useless for practical applications, but 
fascinating from a philosophical point of view, because it shows that there are limits to 
knowledge. [4B] 

This is as much overboard as anybody can go. 

An alternative approach to complexity is possible. I believe that the property of 
complexity applies to open combinatorial spaces, such as configuration space of Pattern 
Theory, or molecular structures in general, or an individual language as a whole, and not 
to individual configurations in those spaces. My thesis, in terms of Pattern Theory, is that 
all individual configurations (structures) are equally simple or complex within the same 
configuration space defined by the generator space and regularity. To be concrete, all 
chemical structures, from water to DNA and proteins, have equal pattern complexity 
defined by the size of the Periodic Table and properties of atoms. 

For the first awkward take on this issue, difficult to accept and to formulate, see 
The New and the Different [1K]. From this standpoint, all sequences of 1 and 0 have 
equal pattern complexity, I would say, about 2. It jumps to ~3, however, if we add 
the number 2. As a consequence, while all chemical structures are equally complex (or 
simple), the history of chemistry is a record of growing pattern complexity, from the four 
elements of the ancient Greeks to the first periodic system and its subsequent expansion. 
This applies also to larger structural blocks, such as carbohydrates, proteins, steroids, and 
aromatic cycles. 

In addition to the above idea, I suggest setting aside any absolute measure of 
complexity. What we can evaluate is a difference of complexities within a larger space. 
Moreover, the difference matters mostly in terms of more or less. More exactly, I am 
talking about a partially ordered set (scale) of complexity values. (See Essay 13, On 
Numbers, [1M] for more about partially ordered scales, for example, in Confucian ethics; 
in short, we cannot measure somebody’s virtue, but can compare two virtuous persons). 
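
A partially ordered scale of this kind can be sketched in a few lines. The sketch assumes one possible ordering, chosen here for illustration and not defined in the essay: a configuration space counts as “more complex” than another only if its generator set strictly contains the other's, so that two spaces with unrelated generators are simply incomparable. The function name and the sample generator sets are hypothetical.

```python
def compare_spaces(generators_a, generators_b):
    """Compare two generator spaces by set inclusion (an assumed ordering).

    Returns "equal", "less", "more", or "incomparable".
    """
    a, b = frozenset(generators_a), frozenset(generators_b)
    if a == b:
        return "equal"
    if a < b:  # a is a proper subset of b
        return "less"
    if a > b:  # a is a proper superset of b
        return "more"
    return "incomparable"


binary = {"0", "1"}
ternary = {"0", "1", "2"}
letters = {"a", "b"}

print(compare_spaces(binary, ternary))  # binary space sits below ternary
print(compare_spaces(binary, letters))  # no common scale without a larger system
```

The “incomparable” outcome is the point of a partial order: a comparison becomes possible only inside a larger space that contains both generator sets.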

The wide use of differences instead of absolute values is typical for traditional 
pre-computer chemistry. Differences, unlike absolute values, are always accessible and 
practical. Will you be more or less happy if you divorce XYZ and marry ZYX or buy UVW shoes? 

To prefer relative values to absolute ones means admitting that algorithmic 
complexity must somehow include the knowledge about how the author of the algorithm 
understands the system and what is the size of the grain in that understanding. Thus, we 
can compare various computer outputs, but we cannot compare a computer output and a 
protein unless in a larger abstract system which would somehow include some 
knowledge of computers, as well as of chemistry, and, more dramatically, the knowledge 
of human nature. Nevertheless, this is exactly what I advocate in this essay: we need not the largest, but 
just a larger system. 

I understand incompleteness as the inherent lack of a larger—in time, as well as in 
space—system to describe the system under consideration. To put it simply, in order to 
create a scientific model of eight years of American history (2000-2008), we need to 
collect the historical record for at least a century (1950-2050; we never know in advance 
for how long). We can monitor the expansion of the conceptual space by comparing the 
expansion of the vocabulary in use. This is why I am pessimistic about any “dynamical 
model” theory regarding X-systems. There is an additional reason for that: a “dynamical 
model” requires (if I am not mistaken) a large enough statistical ensemble. The individual 
human mind, however, is incapable of operating with a large number of cognitive entities. 
Nor, for that matter, are we capable of directly handling a large number of things and 
people. 

We always maintain a certain density of information by well-known hierarchical 
aggregation. Thus, contrary to popular opinion, the chemists who deal with really 
complex molecular structures do not necessarily always think in terms of atoms (they 
do not even write all of them in their ideographic formulas) but operate with larger 
blocks of stable fragments of structure. Like everybody, they decrease the complexity of 
configuration space until it is manageable by the human mind. The politicians, spitting out 
frozen cliches, slogans, and sound bites, use the same trick for managing the minds of the 
public, which has enough complexity to deal with in private business and family life. Not 
that there is anything wrong with that, except that democratic elections become irrational. Love can be 
irrational, but an irrational vote could be literally suicidal or even murderous, if it is the vote of a court jury. 

There is a curious document, The Use of Complexity Science. A Survey of Federal 
Departments and Agencies, Private Foundations, Universities, and Independent 
Education and Research Centers. October 2003. A Report to the U.S. Department of 
Education ( http://www.complexsvs.org/pdf/ComplexityScienceSurvev.pdf ). The title 
speaks for itself. I see it as money well earned by riding a wave. Complexity must be practical. 

Complexity is practical. As an example of the kinetic consequences of 
complexity, I regard evolution of language as simplification of grammar in order to 
increase the speed of communication and keep up with the speed of events [1C], as 
well as with their complexity. By the same token, I anticipate the political leadership of the USA to 
be more and more superficial in the future, religion and obscurantism more and more attractive as a refuge 
from complexity, and the voter’s behavior more and more capricious and self-damaging—until a new phase 
transition in the mechanisms of civilization. 

I conclude PART ONE with another personal note. While reading Chaitin’s 
exuberant book, I recalled my own excitement when discovering—long ago—Leibniz’ 
ideas about time and space, the concept of meta-mathematics, the first publications of 
Stephen Wolfram on cellular automata, the idea of algorithmic complexity 
(“Kolmogorov-Chaitin” complexity, although Kolmogorov’s contribution seems only casual), Ulf 
Grenander’s books, and, more recently, Hannah Arendt. Witnessing the birth 
of the modern chemical paradigm of transition state was also an unforgettable experience. 

Blessed be ignorance that gives us the sharp joys of discovery. 

I count on some tolerance of my going overboard with Chemistry of Everything. 
Well, this is just “Words, words, mere words...” 

We are moving from chatting to making: cooking pasta, which is a real thing, and 
assembling a brushmobile, which is even more real than one would think. 



PART TWO: MAKING 



Figure 1. Pasta: A, Dry ditalini; B, Cooked ditalini in water; C 
and D, Drained and slightly shaken pasta; E, Homo Faber was 
here; F, Cooked linguini. 















To take photos for Figures 1A to E and 2A, I cooked some ditalini pasta. The 
diameter of ditalini is approximately equal to the length, which makes the circumference 
more than three times longer than the length. The cylinders have substantial flat ring 
areas at the ends. 

The dry cylinders have only two axial orientations on a firm flat surface: vertical 
and horizontal. The cooked pasta, still in the water (Figure 1B), has more degrees of 
freedom at low effective weight than on a hard surface. But how many? To answer this 
question, we should collect statistics of the orientation and plot the distribution. 
Intuitively, I do not expect it to be random, unless in zero gravity, but I am not going to 
check it. The data will be true for as long as ditalini is on sale, cooking is done in the 
present manner and with the present utensils, and the cooks still exist. I am interested not in what the 
cooked pasta tells me about cooking pasta, but in what it tells me about the world outside: the X-system in 
which pasta emerged, ditalini was invented, and which will probably exist after ditalini is 
extinct. I am interested in what the short run tells me about the long run, because all I 
can see is all of the short run and the past part of the long run, sometimes fragmentary. 
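
The experiment declined above (collect the orientations, plot the distribution) would be short to script once the angles were measured. The sketch below assumes that each ditalini cylinder is described by a single tilt angle between 0 degrees (lying horizontal) and 90 degrees (standing vertical); the function name, the bin width, and the sample angles are all hypothetical and are not measurements taken from the photos.

```python
from collections import Counter


def orientation_histogram(angles_deg, bin_width=30):
    """Count tilt angles (0 = horizontal, 90 = vertical) in equal-width bins.

    Returns a dict mapping each bin's lower edge to its count.
    """
    if 90 % bin_width != 0:
        raise ValueError("bin_width must divide 90 evenly")
    counts = Counter()
    for angle in angles_deg:
        if not 0 <= angle <= 90:
            raise ValueError(f"angle out of range: {angle}")
        # An angle of exactly 90 is placed in the last bin.
        start = min(int(angle // bin_width) * bin_width, 90 - bin_width)
        counts[start] += 1
    return {start: counts.get(start, 0) for start in range(0, 90, bin_width)}


# Hypothetical tilt angles, invented for illustration only.
sample_angles = [0, 5, 12, 88, 90, 45, 3, 87]
histogram = orientation_histogram(sample_angles)
for start, count in histogram.items():
    print(f"{start:2d}-{start + 30:2d} deg | " + "#" * count)
```

A flat histogram would indicate random orientation; peaks near 0 and 90 degrees would indicate that the cooked cylinders still prefer the two orientations of the dry ones.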

The still photos do not convey any idea of time. Nevertheless, I am going to 
interpret the difference between the photos in Figure 1 in terms of stability and lability 
which imply the notion of time. 

Stability points to the preferred direction of events in the long run: toward higher
stability. Like energy (which it essentially is), it depends on the intensity of chaos, i.e.,
abstract temperature, in the system. A frozen abstract system at zero chaos does not
change. I should be more precise: an abstractly frozen abstract system at abstract zero of abstract chaos
does not change, which means that Leibniz’s abstract time stops. Nothing is stable in a very hot
system even in the short run. Of course, long and short time are also relative notions, but 
all we need is some distinction between them. For example, an average human lifespan is 
long as compared with an hour, although both are short on a cosmic scale. 

Physics abstracts from human presence and ignores the chance that a human hand 
will catch the falling Newtonian apple in mid-flight. History does not argue with physics 
but stretches human presence over particles and waves. Chemistry and biology take 
intermediate positions.





With my aversion to definitions and absolute values, all I can do is to distinguish 
between more stable and less stable configurations or states. While configurations are 
abstractions, states relate to a particular system in which not all configurations and their 
sequences in time are possible. Ulf Grenander reflects this subtle distinction by using the 
term image for an actually observed “realistic configuration,” not yet processed into the
more idealized configuration. 

In other words, a configuration is a mathematical abstraction, which 
typically cannot be observed directly, but the image can. [5A, p.91] 

Photos in Figure 1 are exemplary images in this sense. Images cannot be
interpreted as configurations without some arbitrary participation of the mind or a
man-made program. It is needed for choosing generators and rules of identification. Pattern
Theory acknowledges the human presence in mathematics: “we introduce the equivalence
relation” [5A, p.91]; we have to decide whether two images are similar.

A third principle deals with observability: given two combinations, when do they appear 
identical to an ideal observer (with perfect instrumentation)? [5B, p.3]. 

With some relative measure of time we can detect the differences in stability 
between two systems and the difference in the temperature for the same system. We do 
not need any absolute numbers for that, but we need a human observer of the images. 
Recalling Heisenberg’s prophecy [6], 

When we speak of the picture of nature in the exact science of our age, we do not mean a 
picture of nature so much as a picture of our relationships with nature. [6, p. 29]. 

We can see the inherent difficulty with all formal systems as soon as we ask the 
question “Are we, humans, nature?” Or: “Are we the undercounted part of algorithmic 
complexity?” We are present in any description of nature because we choose the size 
of the grain in the hierarchical generator space. We have little choice only in the case 
of computation, where the universal grain is 0 and 1. I am not sure this is so, however, 





because we have to code the non-numerical notions, like symbols +, =, and f (function),
but my expertise in mathematics is closer to 0 than to 1. Besides, infinity (incompatible
with anything human) creeps in, if not in computation, then in our chatting about it.

The difference between configurations and images comes to the foreground when 
we deal with human fantasy, creativity, invention, and planning, i.e. imagination, of 
course. I wish to draw attention to the ability of Pattern Theory to speak about both 
nature and mind in a unified language, or, in Heisenberg’s words, about our relationship 
with nature. 

When by nature we mean X-systems, we should get used to doing without anything
perfect, exact, definite, and precise, as we already do in calculating the behavior of 
complex inanimate systems like bridges, airplanes, and weather. They all can let us down. 

Back to pasta. The cooked cylinders form relatively stable coaxial assemblies
several pieces long (oligomers, as a chemist would say) because their bases can stick
together along their flat cross-cut rings (Figure 2B). The significant, almost biological
regularity comes as no surprise because pasta is man-made. A similar tendency to form
organized structures is seen in linguini forming strands of parallel filaments (Figure 1F).

It is also used in liquid crystal electronic displays. 



Figure 2. Pasta bonding: A, no bonding; B, at the base 
surface; C, along the generatrix line; D, at a point. 

Other types of connection, such as Figure 2C and 2D, may occur as dimers but
hardly as polymers. The area of contact along the generatrix of the cylinder is much
smaller than that of the basal ring and, therefore, the bond of 2C type is weaker than that
of 2B. The 2D type connection is even weaker.






For elongated particles the parallel bonding is more stable than the consecutive
one; compare with Figure 1F.

The green rectangles encase single linear strings and the red ones emphasize a
weaker parallel bonding. The blue frame in Figure 2E displays an unusually high
degree of regularity in two dimensions, which resembles a raft structure. This image,
depending on the angle of view, looks either artificial or, just the opposite, as natural as
a biological formation. No wonder, because, philosophically speaking, it is both: it is
artificial because here I became homo faber and arranged pasta bit by bit by hand and not 
by shaking; it is naturally biological because I am alive and I have a mind of my own. 
This painful relativity of life and mind has been causing endless discussions about both, 
but on the platform of Pattern Theory I have as little problem with mind as chemists have 
with life. Which means: little but not nil, maybe even a lot. 

An interesting side dish of a problem comes here into view. How can we detect an 
artifice, for example, on another planet? The evidence of artificiality of 2E can be seen in 
the absence of the intermediate forms between chaos and extreme order. Artificiality, so
to speak, is anti-Gaussian and in this sense abnormal. The absence of intermediate forms in 
paleontology comes to mind as a sign of both natural and artificial origin of species. 
Artificiality here means that the species, probably, develop not (or not only) by a random 
genomic walk but in response to a new source of non-equilibrium order. 

The problem of elephant’s trunk—antagonizing biologists over Darwin and even Lamarck and 
Cuvier—remains a mystery. In biology it has the status of Fermat’s Last Theorem with a big difference: it 
is a problem about possibility and not about impossibility, as in mathematics. 

Now we are coming to the main treat. Once arranged, the regularity of 2E type 
has a significant stability because the multiple weak bonds along the generatrix keep the 
entire pasta “raft” together as if they were one strong bond. Only as if, because there is 
no such single bond and the aggregation can be explained in terms of overall stability. In 
other words, stability is additive: it is a sum (approximate) of increments. 
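
The additivity of stability can be put in a rough numerical form. The sketch below is only an illustration under stated assumptions: each weak contact is given the same hypothetical stability increment, and the weight of the fully separated state at abstract temperature T is assumed to follow a Boltzmann-like factor exp(-S/T), where S is the summed stability. The function names and the numbers are invented for the example and do not come from the text.

```python
import math

def total_stability(increments):
    """Additive stability: the (approximate) sum of local bond increments."""
    return sum(increments)

def breakup_weight(increments, temperature):
    """Assumed Boltzmann-like weight of the state with all bonds broken."""
    return math.exp(-total_stability(increments) / temperature)

weak_bond = 0.5  # hypothetical increment, arbitrary units

# One weak contact versus a "raft" held by ten such contacts,
# both at the same abstract temperature (intensity of shaking).
single = breakup_weight([weak_bond], temperature=1.0)
raft = breakup_weight([weak_bond] * 10, temperature=1.0)

print(f"single contact: {single:.4f}")
print(f"raft of ten:    {raft:.6f}")
```

Under these assumptions the ten weak contacts behave like one strong bond: the weight of complete breakup is the product of the individual weights, so it falls off very quickly as contacts are added.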





This is the heart of chemistry: the global property of a structure can be derived from local 
components. It seems that in social chemistry the global property can be imposed from a local point (King 
Hammurabi or Joseph Stalin), but the star topology is reducible to individual bonds. To follow the law or 
not is our personal decision. We obey because otherwise our stability will be decreased. I believe there are 
physical theories based on strictly local interactions, but I am not an expert. Look for works of physicist
Gerhard Mack on universal dynamics. 

Once formed, the raft disintegrates only at a higher chaos than that of its creation. 
In order to bring the raft to existence and make it stable, I had to perform work, i.e., a 
series of gentle movements against some weak forces in a particular order. Moreover, an 
idea of the final configuration in my mind preceded the actual work. The idea was neither 
an image in PT sense, nor a single configuration, but rather a pattern of a 2D arrangement 
similar to 2D natural images. The idea itself had developed in two stages. Initially I only 
wanted to make long strands of cylinders, but having made a couple of them I decided 
that to put them side by side would make more effect. The artificial raft is an ideogram 
of my idea of artificiality (imposed regularity). There could be a variety of graphic 
symbols for this idea and even a superficial browsing through literature on semantics and 
semiotics (see some examples at [10]) shows that everybody can invent his or her own 
symbols. Nevertheless, the symbolic systems could be arranged in order of their complexity. There are 
some standard Platonic components in them, like point, line, arrow, and circle. 

The bits of pasta are held together only at low generalized temperature. If shaken 
above a certain level, they can separate and reassemble. The intensity of shaking or 
stirring, therefore, is a metaphoric image of the temperature, which illustrates the idea 
of what chaos, temperature, and bond mean in physics, chemistry, and beyond, including
history. This image is generated by my mind—reverse-engineered—from a configuration 
of an idea and presented to the reader as a language construct. 

Temperature is an ideogram, but I cannot imagine a static visual symbol for it. It 
can be well rendered in dance, however. 

Thus, high social temperature, often called “political pressure,” (indeed, high 
temperature increases the pressure of the gas in a closed vessel), or “turmoil” can split a
coalition or weaken the bond between the leader and his party. 





Note that I know nothing about the actual physical mechanism of interaction 
between the pasta cylinders. Nevertheless, I venture to propose a kind of understanding 
by choosing the language of stability: the bits of ditalini stay together because it is more 
stable than to stay apart. 

IMPORTANT: bonds can be positive, so that the bonded state is more stable
than the unbonded one, and negative, when the unbonded state is more stable than the bonded one. In the
second case, a force or work is needed to keep atomic units together, as it commonly
happens in modern societies, especially those created, like Iraq, by external initiative or, like
Communist China, by a revolution. In such cases the internal binding force can be quite
brutal.


Figure 1 leaves us in the dark about the origin of life, but it tells us something 
about the origin of complexity. First, it can grow gradually and irreversibly because of a 
variety of bonds that generators can form; second, the concepts of life and mind do not
have sharp borders in the language of generalized chemistry, i.e., Pattern Theory. It might 
be possible to erase borders completely in a language of Pattern Theory enriched by 
adding the kinetic concepts to the thermodynamic ones. 

On the above foundation of pastology one can devote his entire life to elaborate 
study of the world of pasta. Far from that, I further intend to build just a little more on 
that foundation by emphasizing the kinetic aspect of generalized chemistry, although in 
vague, tongue-tied, non-mathematical terms, which, as I heretically believe, are most
appropriate for X-systems. In chemistry, business, and politics, as in cowboy movies, the 
quickest shooter wins ... in the short run. 

Lability and stability are two characteristics of molecules regarding their 
propensity for change. 






Figure 3. Lability (A, B) and inertness (C, D): 

A, Ratchet and pawl; B, Zipper; C, Rusty bolt; D, Candle. 


Lability or reactivity (in chemistry), unlike stability, means that the change from 
one stable configuration to another, even less stable one, occurs pretty easily. The word 
comes from the Latin labilis, tending to slip. The politician that easily changes positions 
can be called labile. Bureaucracy is inert, difficult to move, which is the opposite of 
labile. When the resistance of bureaucracy is overcome in one particular case, the next 
will be as tough. The furniture is hard to slide on a carpet, much easier on a wooden floor,
and even more labile if low-friction pads are under the legs.

The concepts of stability/instability and inertness/lability are confusing because 
they overlap in common use and even students of chemistry often make mistakes. 


Intuitively, both instability and inertness should have stability as the opposite, and they indeed 
have. Stability, however, like many widely used terms, has at least two meanings: immediate (short run, in a
particular act) and prolonged (long run, under unspecified circumstances).


Their distinction can be clarified with pictures in Figure 4. 


1. Ratchet and pawl. The ratchet wheel, Figure 3A, easily turns only in one 
direction because the transition from one position to the next runs into obstacles of 
different strength, depending on the direction. This asymmetric system is labile 
counterclockwise and inert in the opposite direction. It is labile, however, because it 
takes only a little work to change its state. Otherwise it is sturdy enough and if not 
exactly reversible, then movable into any initial position. 




The ratchet is, in my eyes, an ideogram for irreversibility by design, not by 
thermodynamics: a very important distinction. This kind of irreversibility has human or
biological origin. It is life-made, if not man-made. 

There is significant recent literature on ratchet as a general principle [11], or 
ideogram, as I would say. 

2. Zipper, Figure 3B. Multiple weak bonds formed by the teeth of the zipper
form a strong bond between the two connected pieces of fabric. The connection is labile
and symmetrical (reversible) because the bonds can be locked and broken one by one.
This type of connection is typical for the biochemical structures of life: the coiled
enzymes, structural proteins, and nucleic acids can be unfolded relatively easily by an
excess of small molecules capable of forming labile bonds with them.

The typical chemical bonds that form the skeleton of a molecule are strong
(covalent) bonds, difficult to break up. They are inert, but not necessarily stable. 

Now we can step aside and watch the zipper from the Archimedean point. We 
immediately note the master of the zipper, whoever he or she is. 

Why does the zipper open? Because the final state in the largest possible system, 
including a human, with bonds to and from the zipper is more stable than the initial state. 
The human master of the zipper feels relief or satisfaction after having opened it. The 
tension drops, but it might jump up the next moment, if two humans are involved. The zipper stands as
an ideogram for much larger and more complex X-systems, such as a conflict between 
two powers over a border incident (easily reversible in principle). 

3. Rusty bolt, Figure 3C, is trickier than it seems. It is inert the first time you try
to unscrew it, but right after that it can behave like the zipper, unless again abandoned to
the elements. I choose it as an ideogram for a rejection of new ideas by a majority. Note
that the rusty bolt sits tight because of multiple weak (non-covalent) bonds. 

Some people display the rusty bolt pattern of behavior: reserved and cautious the 
first time but friendly and warm afterward. The same pattern can be seen in resistance to 
new ideas. “Screwed up” is another ideogram, effective in politics and business. 





4. Candle, Figure 3D, is my favorite example. It is made of paraffin, a mixture of 
hydrocarbons capable of burning in contact with the oxygen in the air, which means that 
the stability of the candle-air system is low. Its lability, however, is also low, so that it 
can be stored indefinitely, unless ignited, while an iron nail reacts with the same oxygen 
and turns into rust in a moist atmosphere. 


Figure 4. Transition barriers of: A, ratchet and pawl; B, zipper; C, rusty
bolt; D, burning candle (vulgarized!). Labels in the figure: time, Burning, Resurrection.


White phosphorus, as an extreme example, spontaneously ignites in the air and
burns brighter than the candle. The red phosphorus is inert and does not change if
exposed to air. Both consist of exactly the same atoms, like Dr. Jekyll and Mr. Hyde,
only arranged differently.

High stability, therefore, means that the system is inclined to remain in this state 
no matter what and not to move to less stable states. High lability means that “no matter 
what” does not matter and the system will move if slightly prodded. 

The reason why, unlike the Phoenix bird, the candle cannot be resurrected is that 
the products of the burning are much more stable than the starting candle and oxygen. 

The physical-chemical picture behind stability and lability is rather complex and 
technical. As far as lability is concerned, most chemical transformations are reversible 
and a dynamic equilibrium establishes between all participants and products of 
transformation. Natural lability has no ratchet properties. The position of the equilibrium 
depends on the temperature. 

The principal reason for the irreversibility of burning is that the heat of the 
burning, as well as the gases, are dissipated and dispersed over a large volume. I don’t 
know about the Phoenix birds, but if a hundred sparrows are released from the cage, there 
is no chance they will come back without a team of skillful catchers. 

The candle as ideogram stands for an unstable inert system in a precarious 
environment. It is just a little bit more stable than a gasoline can with a matchbox nearby.
Today I would associate the intact candle with all nuclear weapons on earth and the 
burning candle with the process of burning mineral fuel, ignited by the Industrial 
Revolution. The intact candle is in the initial state of the process and the burning candle 
is in a long transformation that we can observe in all detail. In the short run, the mineral
fuel will be exhausted, but in the long run—we do not know how long—there is a chance 
that plants and algae will do the job of catching the carbon sparrows, using the sunlight as 
the source of energy. This is exactly what they have been doing since life emerged on 
earth. The problem is that the long run of Industrial Revolution is a very short run in the 
history of biosphere. 





The fact of crucial importance is that we are incapable of predicting what will
happen next. What we call a true historical prediction is a winner in a race between many
runners. Of course, somebody with a good record has decent chances. From the human
standpoint, pessimism is as justified as optimism. This is a typical example of the
openness of an X-system. The largest X-system on earth is always pregnant with novelty.
All we can say is that some human societies will adapt better to the burning process than 
others. Collapse by Jared Diamond [3C] is, by the way, another example of the 
Archimedean point taken by an author in a non-technical language. 

Answering the call of Hannah Arendt to think about what we are doing, what 
comes to my mind first is burning the candle at both ends. 

I believe that with the above examples of some very general patterns of systems
behavior I follow in the steps of Rene Thom’s Structural Stability and Morphogenesis
[7A]; see Figure 5. What he called archetypal morphologies, I would call ideograms in
the sense in which some Chinese and Egyptian characters are. See [1J]. I regard ideograms as
highly compressed, large-grain alternatives to dynamical systems.

I have already used the term ideogram several times, but my description (no 
definitions!) “any symbol for representing abstract ideas” needs some illustrations for 
symbols other than acoustic, pictorial, or gestural, i.e. verbal. There is a problem, 
however. 

If you decide to venture into the jungles of X-systems, you must be prepared to 
get immediately lost in the thicket of literature that grows right before your eyes. As soon 
as you cut through a couple of publications, twenty new ones spring right from the 
ground or dangle from the branches. Moreover, you have to be prepared to venture into 
quite disconnected and distant areas of knowledge—as if you were required to speak all 
languages of the globe and have degrees in all sciences. One cannot do all that without a 
good deal of ignorance, hubris, and naivete. Even some respected authors, who use 
similar terminology and ideas and live in the same period, seem not to notice each other’s
presence. As a striking example, semantic topologist Leonard Talmy in his vast 
bibliography misses semantic topologist Rene Thom and scores of authors writing about 





patterns do not even mention Pattern Theory. The excitement, however, is exactly in 
mapping them all in a compact single image. What helps is gradual triangulation: 
somebody can see two distant mountain peaks and put them side by side in a single frame. 
This is how the science of Everything is being built. Formal systems are picket fences 
around private lots. 

The literature on graphic representation (topology, more accurately) of meaning 
is large, growing, and with little consensus in sight. I include select references at [10]. I 
am planning to return to this subject and compare the ideas of cognitive semantics 
(Leonard Talmy) with Pattern Theory (Project Golem of Ulf Grenander [5C]). 

Ulf Grenander designates content and connector (i.e., structure) as the two aspects of a thought
configuration; Leonard Talmy calls content and structure (i.e., connector) the components of cognitive 
representation. 

Here I limit myself to Rene Thom as one of the founding fathers —and a martyr— of 
the subject. 

Thom’s idea, as I interpret it, was that a graph could represent an essential type of 
change in the state of a system, which he called a catastrophe. His graphs, however, are 
pictures and not graphs in the mathematical sense. It comes as no surprise because a graph,
i.e., simply speaking, a combination of points and connecting lines, is static by its very 
nature. 

Thom’s ideas and illustrations are often vague and his text is sometimes a banal 
chat. I think it is unfortunate that his work of cardinal importance remains unsettled and 
unfinished, although his ideas were used or emerged independently in linguistics. For 
more about ideograms and metaphors, see [1C, Chapter 6]. 

My primary exasperation with Thom is that I see nothing that a chemist would 
qualify as structure. Ironically, while writing about structural stability and morphogenesis, 
Thom never actually cared about structure and was indifferent to “material evidence” of 
making. 




Figure 5. Abstract patterns of change (archetypal morphologies),
after Rene Thom.


For real structure as the chemist understands it one should turn to Ulf Grenander. 







Notably, only Thom’s fastening includes three parallel timelines. It would be 
logical to assign three timelines to all cases with donor, acceptor, and the migrant: giving, 
taking, and sending. 

Stirring is the only “one-line” pattern in Rene Thom’s classification. This is
where the closed system fails to open to the novelty of X-systems which never come to
the same state. As a chemist, I would interpret Thom’s fastening as bonding, for example,
which would require a reverse breakup. Adding the stability dimension, I would use a stirring
transition hump in “Thomist” ideograms, see Figure 6, for destabilization A1, stabilization
A2, and their combinations in binding B and splitting C.


Ulf Grenander’s operations of pattern dynamics, ADD, REMOVE, MODIFY, etc., applied
to generators and bond couples, conform to those of chemistry. They include formation
and breakup (more generally, modification) of the bond and removal, insertion, and
substitution (more generally, modification) of a generator. They all amount to a breakup
and formation of a bond, but, unlike anything I am aware of, they accommodate the
property of novelty which distinguishes the X-systems from the traditional closed ones.

If Thom’s patterns are reduced to horizontal timelines of stability alternating with
ascending, descending, or hump-like transitions, they will conform to the familiar
chemical minimalism. In [1L, Fig. 21.8, p. 284] an irregular stability graph
is used to portray the period of French history starting with the
French Revolution, the latter itself looking like the back of a stegosaurus
[1L, Fig. 23.1, p. 323].

Figure 6. “Thomist” patterns of abstract
change (A), binding (B) and splitting (C).





Coming back to pasta, its “chemical physics” is illustrated in Figure 7. The
vertical axis corresponds to increasing generalized energy and decreasing stability. 

Two pasta cylinders can be disconnected (A) and connected (C) and (D). If the
system is “heated” by shaking, a certain equilibrium establishes between all possible
mutually convertible more or less stable forms. This conversion occurs through a set of
very unstable and ephemeral states of mutual reorientation and rearrangement, as, for
example, A*. The asterisk emphasizes its transient and both highly unstable and labile
character. It can slip (remember? labile comes from the Latin for slip) toward C and B
or dissociate into A.



Figure 7. Interaction of pasta cylinders. 

We can evaluate stability experimentally by observing the pasta behavior and
comparing the average lifetime of the stable states, as well as of the transition state,
depending on the degree of chaos. The transition states are very short-lived.
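
The dependence of the equilibrium on the degree of chaos can be sketched numerically. The example below assumes, for illustration only, that the relative populations of the stable forms follow a Boltzmann distribution over their generalized energies. The energy values for A, B, C, and D are hypothetical (lower means more stable, with the base-to-base bond B taken as the strongest), and entropy effects such as the number of ways to be disconnected are ignored.

```python
import math

def populations(energies, temperature):
    """Relative equilibrium populations of states (assumed Boltzmann form)."""
    weights = {s: math.exp(-e / temperature) for s, e in energies.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

# Hypothetical generalized energies in arbitrary units.
energies = {
    "A": 0.0,   # disconnected
    "B": -2.0,  # bonded at the base surface
    "C": -1.0,  # bonded along the generatrix
    "D": -0.5,  # bonded at a point
}

cold = populations(energies, temperature=0.2)  # gentle shaking
hot = populations(energies, temperature=5.0)   # vigorous shaking

for state in energies:
    print(f"{state}: cold {cold[state]:.3f}  hot {hot[state]:.3f}")
```

With gentle shaking nearly everything sits in the most stable form; with vigorous shaking the forms become almost equally populated, which is one way to read "nothing is stable in a very hot system."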

The main premise of the theory of transition state in chemistry is that the speed of
the transformation from one stable state to another decreases with the height of the
transition barrier. For example, the transition barrier from A to C is a function of the
energy difference between A and A*.
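
A minimal numerical sketch of this premise follows. The text only says that the speed decreases with the barrier height; the exponential (Arrhenius-type) form used here is the usual textbook assumption, and the energies, pathway names, and prefactor are hypothetical values chosen for illustration.

```python
import math

def barrier_height(e_initial, e_transition):
    """Barrier = energy of the transition state minus energy of the start."""
    return e_transition - e_initial

def relative_rate(barrier, temperature, prefactor=1.0):
    """Assumed Arrhenius-type rate: falls exponentially with the barrier."""
    return prefactor * math.exp(-barrier / temperature)

# Hypothetical generalized energies, arbitrary units.
e_A = 0.0
transition_states = {"A -> C": 3.0, "A -> B": 4.0}

temperature = 1.0
rates = {
    path: relative_rate(barrier_height(e_A, e_star), temperature)
    for path, e_star in transition_states.items()
}
fastest = max(rates, key=rates.get)

for path, rate in rates.items():
    print(f"{path}: relative rate {rate:.4f}")
print("fastest pathway:", fastest)
```

Only the comparison between pathways matters here, not the absolute numbers: the pathway with the lower hump wins in the short run, whatever the final stability of its product.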


















The exact numbers for X-systems can hardly be available and even if they were, 
they would be of no importance. What matters is the comparison of a few alternative 
pathways of transformation. Of course, as Michael Flynn noted in his book, the 
standardization of the society would drastically simplify predictions. 

“A uniform, docile society is more predictable, and Theirs [clyologists’] forecasts would
be simpler and more precise” [3A, p. 108].

The problem with predicting the future is that you cannot ask whether an event X 
will happen unless it has already happened. Otherwise you do not have even a name for it. 
In your imagination you have to invent new chimeras and give them flashy names. It
is still an open question whether there are patterns of historical events that cover all
foreseeable and relevant future, and I believe that we should find the answer.
Imagination, fantasy, and thought experiments are instruments in a laboratory for
experimenting with the future. To think takes much less energy than to wage a war or
build a sandcastle of a paradise on earth. Imagination is equally necessary for a chemist
and for a detective. See [1A].

In order to compare a few scenarios (called mechanisms or pathways in chemistry) 
we need some criterion to distinguish between more probable and less probable ones. In 
other words, we need some knowledge about the preferred direction of events, and the 
first step toward this knowledge is the admission that the preference exists. Physics is the 
science of possible and impossible. The Newtonian apple can fall down but cannot jump 
up from the ground on its own. Within an X-system, however, it can move against 
gravity from the ground into the basket. The Newtonian horse can neither stop nor turn 
on its own. Chemistry is less radical: it is about predicting the outcome of the horse race. 
History is all about twists and turns. 

The following example illustrates a special case of a preferable direction of
events.

A property of an object is called isotropic if it is the same in any direction and 
anisotropic if it changes. Animal fur, for example, is smooth in one direction and rough 





the wrong way. A wooden board surface is smooth along the grain and rough across. 
Corduroy fabric is another example of a highly anisotropic surface. If an anisotropic 
object is placed on a rough surface, chaotic disturbance moves it in the direction of the 
minimal friction. In Figure 8, a brush with orderly bent (anisotropic) bristles is placed 
on a rough isotropic surface. Chaotic disturbances, symbolized by ↔, result in a sliding
of the upper piece in the direction of the least friction. 


Figure 8. Irreversible lability in the case of
anisotropic friction.
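
The effect of anisotropic friction can be imitated with a toy simulation. This is a sketch under loud assumptions, not a physical model of the brush: the disturbance is an unbiased random kick, and a kick moves the brush only when it exceeds a friction threshold that differs for the two directions. The function name, thresholds, and step count are hypothetical. The kicks stand for externally driven vibration, not for equilibrium thermal noise.

```python
import random

def drift(steps, easy_threshold, hard_threshold, seed=0):
    """Net displacement of a brush under symmetric random kicks.

    A kick moves the brush only by the amount it exceeds the friction
    threshold for its direction; unequal thresholds model anisotropy.
    """
    rng = random.Random(seed)
    position = 0.0
    for _ in range(steps):
        kick = rng.uniform(-1.0, 1.0)  # unbiased chaotic disturbance
        if kick > easy_threshold:
            position += kick - easy_threshold   # slides the easy way
        elif kick < -hard_threshold:
            position += kick + hard_threshold   # rarely slides back
    return position

anisotropic = drift(10_000, easy_threshold=0.2, hard_threshold=0.8)
isotropic = drift(10_000, easy_threshold=0.5, hard_threshold=0.5)

print("bent bristles (anisotropic):", round(anisotropic, 1))
print("straight bristles (isotropic):", round(isotropic, 1))
```

With equal thresholds the brush only jitters around its starting point; with unequal thresholds the same unbiased kicks add up to a steady slide in the direction of least friction.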


Figure 9 presents a photo of a “brushmobile” (not related to Art Dog) capable of 
moving and even climbing up a slight slope. It has a couple of bent brass wire brushes 
fixed under a small vibrator. 



Figure 9. Brushmobile: 1, Brush; 2, Platform; 3, Vibrator
with a battery inside.


The working toy model moves because it dissipates the electric energy from the 
battery, wasting most of it on friction. It also moves without a vibrator if placed on a 
vibrating surface. On a planet with chaotically vibrating surface the brushmobile could 
transport loads against gravity. It embodies an abstract pattern that would be very
difficult to symbolize graphically. A simple arrow → does not tell the whole story. This







emergence of regularity from chaos is the long known ratchet effect, widely used in 
technology, for example, in clockwork mechanisms, as well as in cellular mechanics.

NOTE: Readers with background in engineering could compare the brushmobile with 
the clever Ultrasonic Motor (USM), in which the circular movement is generated by an 
ordered circular wave of mechanical deformation of a surface in contact with another 
high friction surface. The literature on USM with discussion of anisotropy has been 
growing and some is on the Web. A USM-like analog of the brushmobile could probably
work on purely chaotic deformations and jolts, using anisotropic friction. 

As it was shown by Richard Feynman, no order can be produced from pure
thermal chaos, unless the system is far enough from equilibrium. This may seem to
contradict the history of technology with human mind as the source of order. We actually
produce most of our mechanical work from thermal energy, but only until the system
comes to a halt. On a snapshot, our technology and the global civilization are in
a steady state. Over a longer time segment, it is homeostasis, which is different from the
steady state because of a series of distinctive transitions. The curves for the French
Revolution on page 34 correspond to homeostasis, not to a steady state. France has been
inhaling and exhaling in a perturbed homeostasis ever since the Revolution. America, of
course, is not exempt from riding a serpentine bumpy road and turning red, blue, and
white in the face.

Life on Earth must have come from somewhere. The aggregation of multiple 
weak labile bonds into strong labile bonds, in my opinion, is the crucial source of order in 
the genesis of life. We can say the same about the mind, which is essentially a device for 
perceiving weak but repeating regularities as configurations bunched together into 
recognizable patterns. This is the same type of process that creates river beds—an 
analogy noticed by C. H. Waddington, whose term chreod was borrowed by Thom. The 
Grand Canyon, however, would never be possible without gravity which is responsible 
for the asymmetry of the system. 

The precipice between river flow and social evolution seems as deep as the Grand 
Canyon itself. To bridge it is exactly the point. One of many possible definitions of 
X-systems will be “a system in which the Mississippi Delta (Grand Canyon is too 
phlegmatic) and social evolution are similar.” 


Let us take the economy as an example, in which I am an observer but no expert. 
I privately believe that the machine of economy, starting from ancient times, 
works as a brushmobile, pumping the distribution of wealth against gravity (entropy) to 
high inequality in spite of the alleged chaos of individual actions. While the market is 
chaotic, rent, wages, and profits are as asymmetrical as gravity. The chaos of economy 
can be compared with a distribution of hail falling from the skies, while the fall is as 
predetermined as the rise of the kite jerking and wiggling in the wind. 

The principle of the machine was described, on a different occasion, in the Bible 
as “For whosoever hath, to him shall be given, and he shall have more abundance: but 
whosoever hath not, from him shall be taken away even that he hath” (Matthew 13:12). 
While the market is chaotic in the sense that deals are not globally coordinated, the sides 
are driven by the interplay of the instinct of survival and the instincts of domination and 
enrichment, both creating asymmetric friction to the other side’s move. Our human nature 
is openly anisotropic: we want to slide in the preferred direction only: up. Even the 
suicide bomber hopes to ascend to heavens. 

[Figure 10. “Wealth pump.” The drawing includes a part labeled “Generator.”] 

Figure 10 (“wealth pump”) metaphorically illustrates the asymmetry of X-systems from 
the energy standpoint. A hydroelectric generator is driven by the difference of the water 
pressure on both sides of the dam. The energy brings into motion the pump, which works 
against the water pressure in two communicating vessels. The entire system consists of 
two subsystems: the left non-equilibrium subsystem maintains the right one far from 
equilibrium. If the whole system is insulated, sooner or later the differences in the levels 
of liquids will disappear. A more compact symbol for the system is seen in the upper part 
of Figure 10. 

To be more specific, the ideogram equally applies to: 

1. Production of electric energy from the hydraulic energy of a river or from 
mineral fuel and its use for work. 

2. Production of energy in living systems from sunlight or chemical compounds 
and using it for maintaining the integrity of the organism which is alive only while it is 
far from equilibrium. 

3. Maintaining the unequal distribution of wealth which characterizes all 
organized societies, starting at least from the Sumerian civilization. Dollar signs in 
Figure 10 symbolize that. 

4. The concentration of wealth supports not only the complex system of 
economics, but also arts and humanities, as well as the form of government and 
ideologies, sometimes conflicting. That was, probably, the underlying idea of Marx. 

Ironically, it applies also to moving ratchet devices. As newspapers have recently 
announced (2006), “the scientists made water flow uphill.” In fact, it was a case of a 
ratchet effect, [11C], which required a dissipation of heat, falling neatly under the 
ideogram in Figure 10, in the company of the brushmobile and all water pumps in the 
world. 


The difference and interplay between strong (covalent) and weak bonds in 
chemistry is a separate and very specific subject. More can be found in [1] and, in 
chemical context, on the Web. It is very meaningful in a social context. People in the West 
are sometimes surprised by the tolerance of the millions of Muslims to the terrorism of 
Muslims against other Muslims. They were also surprised by the patience of the Soviet 
people toward the Communist dictatorship. I believe it is a rewarding subject to 
investigate in terms of the strength and multiplicity of social bonds in different societies. 

Freedom is a few (or even no) strong bonds in the sea of fleeting lability. 
Tribal and repressed societies, under the watchful eyes of rituals, have little lability and a 
lot of strong bonds. 





According to Darwinism, the deep nature of evolution is chaotic. This may be true 
at a microscopic level, but there is a deep asymmetry on a higher level: the species adapt 
in order to go on with living, not to collapse. The movie The March of the Penguins is a 
heartbreaking illustration of how far the adaptation can go. As the brushmobile moves only 
forward, the species move only toward procreation and normally not to self-destruction. 

According to our contemporary views, which might possibly be disputed, plants do not 
possess any mind. The source of order for the plants is the sunlight, which is, unlike heat, 
an ordered form of energy. But how could life originate in the absence of mind or just 
human, or even animal, nature? I believe that the formation of multiple weak bonds is 
an example of purely chemical and, of course, mindless irreversibility. Each new weak 
bond in the course of prebiotic evolution contributes to the overall coherence and adhesion 
of the system, because it immediately raises the threshold of solidity of the system. 

Taking another example, once a religion takes a canonical shape, it is extremely 
difficult to take it apart and rearrange: it is held together by a large number of 
cross-references. Still, it is possible, as reformist religions demonstrate. 

The weak labile bonds, like those governing the behavior of the cooked pasta, are 
the very beginning of the genesis of evolution. But this is not the whole chemical story, 
for which one should better look into chemistry. In a nutshell, strong chemical bonds are 
typically inert. They do not dissociate easily into disconnected atoms (A-B → A + B), 
a state which is highly unstable, but rearrange in such a way (A-B + C-D → A-C + B-D) 
that the overall change of stability is relatively small. If we were built with weak bonds 
only, we would dissolve in a bathtub. We can say that a strong bond is labile if the 
quartet-like rearrangement runs fast, which is usually achieved by a catalyst, whether in a 
social or biochemical system, by decreasing the transition barrier. In social X-systems 
the catalyst is what Hannah Arendt called homo faber, man the maker, the term Thom 
also used. 
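
How strongly a lower transition barrier speeds up a rearrangement can be estimated with the Arrhenius equation, k = A·exp(−Ea/RT). That equation is standard chemical kinetics brought in here for illustration, not something stated in this text, and the barrier heights below are arbitrary assumed numbers.

```matlab
% Arrhenius estimate of the speed-up from a lower transition barrier.
% Barrier values are arbitrary assumptions chosen for illustration.
R        = 8.314;    % gas constant, J/(mol*K)
T        = 298;      % temperature, K
Ea_uncat = 80e3;     % barrier without a catalyst, J/mol (assumed)
Ea_cat   = 60e3;     % barrier with a catalyst, J/mol (assumed)

% With the same pre-exponential factor A, the ratio of rate constants is
%   k_cat / k_uncat = exp((Ea_uncat - Ea_cat) / (R*T)).
speedup = exp((Ea_uncat - Ea_cat) / (R * T));
fprintf('rate ratio k_cat/k_uncat = %.0f\n', speedup);
```

Under these assumptions a barrier lowered by a quarter makes the reaction faster by a factor of a few thousand, while the initial and final states, and therefore the overall change of stability, stay the same.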

Similar to the chemistry of life, the social and political chemistry consists of the 
interplay between the weak labile bonds, which lock and break by the hour, and strong 
inert bonds of allegiance to family, tribe, institution, or party, which can nevertheless be 
broken in an act of exchange, exemplified by a love affair, divorce, treason, or just 
changing employment. In spite of all declarations of individualism, the modern human is 
extremely unstable in social isolation, so that the true meaning of individualism is lability, 
not stability. 

Can the immense and ever growing concentration of global capital prepare the 
soil for the seeds for Michael Flynn’s cliology? Can a few world tribes inhabiting the 
small impoverished planet be a plausible future? Is the American Empire doomed to 
cede its position of domination to another empire? Words, words, words again. But the 
question mark is half the answer. Nothing like that has ever happened, but each 
possibility is an alternative stable state. Anything can happen. We have to understand that 
what is most probably going to happen is what can happen faster. Whether we can list all 
possible final states is an open question in an incomplete system of knowledge. I believe 
that we cannot, but I do not believe we can ever have a proof one way or the other and I 
do not believe we can have a proof that we can or cannot have a proof, etc. Aristotelian 
logic is novelty-blind. We can play, however, with a Lego set of building blocks, 
constructing and testing various alternatives for initial, final, and transition states, 
checking them against a partially ordered scale of stability for a line-up. This is exactly 
what Sci-Fi is doing, following in the steps of a chemist who plays various chemical 
scenarios and compares the configurations—stable and unstable—by energy. 

Ulf Grenander has always been a keen observer of the world behind the windows 
and, as his books testify, a real connoisseur and navigator of Everything. In [1M], he and 
I (also in [1B]) made an attempt to try on the conceptual goggles of Pattern Theory and 
look through them into the past. Alas, vision, unlike sound, does not generate an echo. But we have 
all the time ahead. It is my impression that the skeleton of the past history can be clearly 
seen in the double-locked cupboard of the future, but the second key is available only to 
professional historians. 

Next, I would like to summarize the main lesson extractable from cooking pasta 
and driving the brushmobile. It seems that there is no connection unless we regard both cases 
as configurations covered by the pattern of irreversible non-equilibrium process. The two 
processes themselves are strikingly different (mechanical movement and growth of 
complexity), but both require a supply of energy to keep them off equilibrium. 

In order to descend somewhat from the dizzying heights of abstraction, we need 
to point to the source of non-equilibrium for the primordial pasta, i.e., the natural analog 
of shaking. I believe that this meta-shaking came from the periodic change of tide, day and 
night, weather, and seasons that kept all slow chemical processes at a constant 
non-equilibrium because the position of equilibrium depends on temperature and 
concentration. This could be possible only if the rate of primordial chemical reactions 
was slower than the rate of the change of conditions. The situation can be compared with 
the contest of a hare and a tortoise moving back and forth between two points: they 
would never be side by side except for very short moments. 
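
The hare-and-tortoise picture can be put into a minimal numerical sketch. The model below is an assumption made for illustration, not the author's: a single quantity x relaxes toward a periodically shifting equilibrium xeq(t) with a first-order rate constant k, and the mean distance |x − xeq| is compared for slow and fast chemistry.

```matlab
% First-order relaxation toward a periodically shifting equilibrium:
%   dx/dt = k * (xeq(t) - x),   xeq(t) = sin(2*pi*t/period)
% Model and parameter values are illustrative assumptions.
period = 1;                     % period of the external change (e.g. one day)
dt     = 1e-3;                  % Euler time step
t      = 0:dt:50*period;        % simulated time span
xeq    = sin(2*pi*t/period);    % moving equilibrium position

rates = [0.1 100];              % slow and fast chemistry, in units of 1/period
lag   = zeros(size(rates));     % mean distance from equilibrium for each rate
for r = 1:numel(rates)
    k = rates(r);
    x = zeros(size(t));
    for n = 1:numel(t)-1
        x(n+1) = x(n) + dt * k * (xeq(n) - x(n));   % explicit Euler step
    end
    lag(r) = mean(abs(x - xeq));
end
fprintf('k = %5.1f: mean |x - xeq| = %.3f\n', [rates; lag]);
```

In this sketch the slow system hardly moves and stays far from the equilibrium position most of the time, while the fast one tracks it closely. Only the slow case corresponds to the constant non-equilibrium described above.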

In the end, as an example of a possible alternative pattern of the further long-run 
course of history, I am submitting my strictly personal view. Please note the blocks of the 
mental configuration. 

From the point of view of a chemist, life originates because chemistry is slower 
than astronomy. Life evolves further because the slow rate of chemical reactions makes 
catalysis—highly selective acceleration of some of them—a powerful source of order. 
Life increases complexity because multiple labile bonds hold together large aggregates. 
Society continues the work of life in the same vein, with homo faber in the role of a 
catalyst and mineral fuel instead of sun. The next step is open to imagination. I personally 
believe that today man-made things are the dominating component of the new civilization, 
money shines as the eternal Sun, and the human being is more faber than homo, more 
enzyme than DNA. If the resources of mineral fuel are depleted, sun-powered Things 
have an evolutionary advantage over heavy, errant, and voracious humans who, with their 
liquid-filled heads, will remain as a source of chaos necessary for further adaptation 
through mutating social DNA. Biosphere, formerly dominated by life, then by social life, 
then by exploding ideas, turns into technosphere. 





I wish to draw attention to the fact that what is stable today may not be stable under 
changing conditions. The fate of the brushmobile individualism strongly depends on the quality of 
the batteries and if you are low on energy, the king or a feudal lord may take you as a 
vassal. Or you are welcome to earn your pasta dinner as a laborer—if the nation’s 
batteries are almost dead. The global change from monarchy and conformism to 
democracy and individualism was done on mineral fuel. At some stage of resource 
depletion the trend could be reversed. 

Purely intuitively I see the ongoing concentration and consolidation of capital as 
the first stage of involution. 

The guns speak when the national batteries are low, and when they speak the 
muses are silent, mathematicians work on weapons, and philosophers are hunkering 
down in the trenches. 


CONCLUSION 

A scattered array of talking points and a few making points amount to the 
following main hypotheses. 

1. As the century of debates testifies, the traditional formal (axiomatically closed) 
physical and mathematical theories have little power over Evolving Complex Systems 
(X-systems) because such theories are devoid of the notion of novelty. The X-system 
changes while we speak and should be analyzed in terms of novelty and difference. 

2. Understanding of an X-system in terms of novelty might be possible while a 
larger X-system remains stable. 





3. Understanding of an X-system might be possible in a standard—but 
expandable—set of ideograms (generalized metaphors) that symbolize their most 
abstract properties. 

4. The standardization of language and a consensus regarding the ideograms 
might be achieved by evaluating them along their complexity. 

5. Complexity of the abstract space housing the X-system might be a more 
appropriate measure for an X-system than the complexity of objects and states. From this 
point of view, evolution is expansion (or collapse) of the generator space. 

6. Configuration space of Pattern Theory is an example of an open space housing 
the structures from molecules to thoughts and with a measure of stability. 

7. The main task in understanding X-systems is to shift the emphasis from 
thermodynamics to kinetics. This can be done, following the pattern of chemistry, by 
comparing alternative transition states. 





REFERENCES 


The advent of the Web is changing the playgrounds where ideas frolic, mate, and 
multiply. The Web does not make the landscape flatter, just the opposite: one can fall 
through the cracks in the constantly shifting and swelling geological maze. In order to 
get out of the crevice, one would need a tug of a crew of links. The Web, however, has a 
kinetic advantage: it brings you to a distant planet in a few clicks, and then it is up to you 
whether to dig in or immediately leave. 

If the topic is X-systems, one can have nine lives and still be unable to catch up 
with the literature. 

Instead of the traditional references, I supply only a few selected ones, grouped by 
topics, because the rest can be easily googled from names, terms, and subjects. 


1. Yuri Tarnopolsky 

At http://www.spirospero.net/complexity.htm 

A. Molecules and Thoughts: Pattern Complexity and Evolution in Chemical Systems 
and the Mind (2003) 

B. TRANSITION STATE IN PATTERNS OF HISTORY (2003) 

C. TIKKI TIKKI TEMBO and the Chemistry of Protolanguage (2004) 

D. Pattern Theory and “Poverty of Stimulus” Argument in Linguistics (2004) 

E. The Three Little Pigs: Chemistry of language acquisition (2005) 

F. Salt: The Incremental Chemistry of Language Acquisition (2005) 

G. Molecular computation: a chemist’s view (2005) 

H. Salt 2: Incremental Extraction of Grammar by Simplistic Rules (2005) 

J. The Chemistry of Semantics (2005) 

K. The New and the Different (1995) 

L. Yuri Tarnopolsky and Ulf Grenander. History as Points and Lines (1998-2003) 

At http://spirospero.net/simplicity.htm 

M. Informal Essays on various big and small topics from an ideogrammist’s 
angle. 

2. Hannah Arendt 

1958. The Human Condition. Chicago: University of Chicago Press. [This book is 
completely fresh today]. 

3. Predictions 

A. Flynn, Michael. 2001. In the Country of the Blind. New York: A Tom Doherty 
Associates Book. [A large Afterword is an informal introduction into formal 
theories of history]. 

B. Wattenberg, Ben J. 2004. Fewer: how the new demography of depopulation 
will shape our future. Chicago: Ivan R. Dee. [I explained depopulation in [1M] as 
competition between humans and things]. 

C. Diamond, Jared. 2005. Collapse: How Societies Choose to Fail or Succeed. 
New York: Viking Books. [This book is very different from the somewhat 
tautological Guns, Germs, and Steel]. 


4. Gregory Chaitin 

A. 2005. Meta Math! The quest for Omega. New York: Pantheon 

B. From Philosophy to Program Size. Lecture Notes on Algorithmic Information 
Theory, Estonian Winter School in Computer Science, 2003 [Part of Chaitin’s 
web page] 

http://www.umcs.maine.edu/~chaitin/eesti.html 

5. Ulf Grenander 

A. 1995. Elements of Pattern Theory. Baltimore: Johns Hopkins University Press. 
[It is written for students of mathematics, but large sections and main ideas are 
accessible to non-mathematicians]. 

B. 1993. General Pattern Theory. A Mathematical Study of Regular 
Structures, Oxford, New York: Oxford University Press. [Advanced]. 

C. Patterns of Thought. www.dam.brown.edu/ptg/REPORTS/mind.pdf Watch 
for updates; see also: www.dam.brown.edu/ptg/publications.shtml 

D. Grenander, Ulf. 1976. Pattern Synthesis. Lectures in Pattern theory, Volume 1. 
New York: Springer-Verlag. Also: 1978, Pattern Analysis. Vol. II, and: 1981. 
Regular Structures. Vol. III. [Advanced, but very rich of general ideas]. 

6. Werner Heisenberg 

1958. The Physicist's Conception of Nature. New York: Harcourt, Brace & Co. 
Also published: 1970. Westport, CT: Greenwood. [Among other things, 
some striking ideas on the future of technology.] 

7. René Thom and related publications 

A. Thom, René. 1975. Structural Stability and Morphogenesis: An Outline of a 
General Theory of Models. Reading, Mass.: W. A. Benjamin, Inc. 

B. Thom, René. 1973. La théorie des catastrophes: État présent et perspectives. 
Manifold 14 (1973); reprinted in Dynamical Systems, Warwick 1974, A. Manning, 
Ed., pp. 366-372. Berlin: Springer-Verlag; also reprinted in René Thom, Apologie 
du logos. Paris: Hachette, 1990 and in: Zeeman, E.C. 1977. Catastrophe theory: 
Selected papers, 1972-1977. Reading, MA: Addison-Wesley. 

C. David Aubin and Amy Dahan Dalmedico. Writing the History of Dynamical 
Systems and Chaos: Longue Durée and Revolution, Disciplines and Cultures. 
Historia Mathematica, 29 (2002), 273-339. 
www.amath.washington.edu/courses/572-spring-2003/aubin.pdf 

D. David Aubin. 2004. Forms of Explanations in the Catastrophe Theory of René 
Thom: Topology, Morphogenesis, and Structuralism. Growing Explanations: 
Historical Perspective on the Sciences of Complexity, ed. M. N. Wise, Durham: 
Duke University Press, 2004, 95-130. 
http://www.math.jussieu.fr/~daubin/publis/Aubin_Catastrophe.pdf 

E. J. J. O'Connor and E. F. Robertson. René Thom 
http://www-history.mcs.st-andrews.ac.uk/history/Mathematicians/Thom.html 

8. Stephen Wolfram 

A New Kind of Science, http://www.wolframscience.com/ [Clear dissatisfaction 
with formal systems]. 

9. Complex systems: old and new ideas 

A. David Aubin, important publications at homepage: 
http://www.math.jussieu.fr/~daubin/publis.html 
[Among them, links to the uniquely rich and deep analysis of the extra-formal 
sources of “a different kind of science” in:] 
1998. Aubin, David. A Cultural History of Catastrophes and Chaos: Around the 
Institut des Hautes Études Scientifiques, France 1958-1980, Ph. D. thesis 
(Princeton University), UMI #9817022. Available on the Web. [It tells much 
more than the title does]. 

B. Ottino, J. M. 2003. Complex Systems. AIChE Journal [American Institute of 
Chemical Engineers], Vol. 49, February 2003, No. 2 

C. Yagil, Gad. Complexity and Order in Chemical and Biological Systems. 
http://www.weizmann.ac.il/home/lcyagil/order.html 

10. Selected works related to ideograms 

A. Talmy, Leonard. 2000. Toward a Cognitive Semantics. Cambridge, MA: MIT 
Press. Homepage: http://linguistics.buffalo.edu/people/faculty/talmy/talmyweb/index.html 

B. Sandstrom, Karin. 2006. When motion becomes emotion: a study of emotion 
metaphors derived from motion verbs. 
http://epubl.ltu.se/1402-1552/2006/022/index-en.html (Abstract) 
http://epubl.ltu.se/1402-1552/2006/022/LTU-DUPP-06022-SE.pdf 

C. Poli, Roberto. Iconic Graphs: An exercise in topological phenomenology. 
http://www.mitteleuropafoundation.it/Papers/RP/Iconic_graphs.pdf 
Homepage: http://www.mitteleuropafoundation.it/Poli.html 

D. Wildgen, Wolfgang. Time, Motion, Force, and the Semantics of Natural 
Languages. http://webhost.ua.ac.be/apil/apill06/WolfgangWildgen.pdf 
Homepage with many interesting wide-angle works: 
http://www.fb10.uni-bremen.de/homepages/homepagebyid.asp?id=34 


11. Selected publications on ratchets 

A. Martin Bier, East Carolina University. Personal page with links to 
publications http://personal.ecu.edu/bierm/ 

B. Raymond Dean Astumian, University of Maine. CV with publications [no 
links, but many are available on the Web]. 
www.cnsi-uc.org/events/nano_seminar/CVs/astumian_cv.pdf 

C. Heiner Linke, University of Oregon. Detailed and informative homepage with 
many publications available on the Web. http://darkwing.uoregon.edu/~linke/ 
Climbing droplets: http://www.uoregon.edu/~linke/climbingdroplets/index.html 

D. Peter Reimann, Universität Bielefeld, Germany. Homepage: 
http://www.physik.uni-bielefeld.de/theory/cm/people/members/preimann.html 
Page on ratchet effects: 
http://www.physik.uni-bielefeld.de/~reimann/RESEARCH/ratchet.html 
Some pdf files are available on the Web. 

12. Bow-tie structures 

A. Marie Csete and John Doyle. Bow ties, metabolism and disease. TRENDS in 
Biotechnology, Vol. 22 No. 9, September 2004. 
http://www.hot.caltech.edu/Trends.pdf 

B. Jing Zhao, Hong Yu, Jian-Hua Luo, Z. W. Cao, Yi-Xue Li. Hierarchical 
modularity of nested bow-ties in metabolic networks. BMC Bioinformatics 
2006, 7:386. http://arxiv.org/pdf/q-bio.MN/0605003 

C. Debora Donato, Stefano Leonardi, Stefano Millozzi, Panayiotis Tsaparas. 
Mining the Inner Structure of the Web Graph. (2005) 
http://delis.upb.de/specials/paris05/paper/ECCS05_Donato.pdf 
http://webdb2005.uhasselt.be/slides/S-9.pdf 

D. Chris Sherman. New Web Map Reveals Previously Unseen ‘Bow Tie’ 
Organizational Structure. (2000). 
http://www.infotoday.com/newsbreaks/nb000522-1.htm 


First draft: May, 2006 Updated (bow-tie structures): October, 2006 

A few of numerous misprints and mistakes are corrected: August, 2007 

Author’s web site: http://spirospero.net 

Rusty bolt: http://www.eleventwentyseven.com/rustednut.jpg 






MATLAB CODES for topological analysis of text and grammar extraction in 
complexity (http://spirospero.net/complexity.htm) 

REMINDER: start the input string P with 'START' and end with 'END'. 

See http://spirospero.net/Salt2.pdf for use 
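
As a hedged illustration of the input format the reminder describes, the snippet below builds a small text as a padded character array, one word per row. The sample sentence is an arbitrary assumption; only the 'START' and 'END' markers come from the reminder above.

```matlab
% Hypothetical input for the scripts that follow: one word per row.
% char() pads shorter words with trailing blanks to a common width.
P = char('START', 'the', 'dog', 'sees', 'the', 'cat', 'END');

nWords = size(P, 1);     % number of rows, i.e. number of words
width  = size(P, 2);     % common padded width of every row
A      = cellstr(P);     % cell array of words with the padding removed

fprintf('%d words, padded to %d characters\n', nWords, width);
disp(A')
```

Note that `length(P)` returns the larger of the two dimensions of P, so for a very short text made of long words it is not the word count; `size(P,1)` is the safer choice.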


pg2.m 

%pg : gives lists of words as NAME, 
%and LEFT and RIGHT neighbors; 
%array sizes made repetitive for other purposes. 
%Text should be preloaded as a character array P=char('abc', 'bad', etc.) 
tic 
zname=[]; z=1; 
words=[]; W=P; wname=P; %wname contains all names with repetition 
WW=[]; %comparison cell; compare W=P and wname=P 
sw=size(W); sww=sw(1,2); %sww is the common (padded) width of a word 
twd=' '; % twd is word template, gives empty word 
for k=1:sww, twd(k)=' '; end, t=twd; 
%t is an empty (all-blank) word 
words(1).name=char(t); w=0; m=0; 
%empty initial array of names 
lenW=size(W,1); % number of words in text (rows of W, not length(W)) 
for i=1:lenW, m=0; WW=W(i,:); z=1; 
    for j=1:lenW 
        WWW=W(j,:); k=isequal(WW,WWW); %WWW is a comparison cell 
        if k==1 & j>i, 
            wname(j,:)=char(t); m=1; %inserts a gap into text 
            z=z+1; 
        end, 
    end, zname(i)=z; 
end 
res=size(wname,1); w=0; 
for f=1:res, 
    if isequal(wname(f,:),char(t))==0, 
        w=w+1; words(w).name=wname(f,:); Z(w)=zname(f); 
    end, 
end 
lnW=size(W,1)-1; lew=length(words); NL=1; NR=1; 
words(1).L(1)={'#'}; 
words(1).R(1)={words(2).name}; % included words(1).L(1)={'#'}; 
for k=1:lew, NL=1; NR=1; 
    for j=2:lnW, 
        d=isequal(words(k).name,W(j,:)); 
        if d==1, 
            words(k).L(NL)={W(j-1,:)}; words(k).R(NR)={W(j+1,:)}; 
            NL=NL+1; NR=NR+1; 
        end 
    end 
end 
%display NAME 
disp(blanks(2)'), disp('WORDS-NAME'), disp(blanks(2)') 
F=char('00'); 
for j=1:lew, F=num2str(Z(j)); FF=words(j).name; 
    disp(F), disp(FF) 
end 
%disp(blanks(2)') %for r=1:lew, disp(Z(r)), end 
pgwrk %script to display L and R neighbors 
toc; t=toc 

pgwrk.m 

%pgwrk displays LEFT and RIGHT neighbors; uses pigwork.m 
disp(blanks(2)'), disp('WORDS-LEFT'), disp(blanks(2)') 
f=char(words.name); lw=size(f,1); %number of distinct words 
for m=1:lw, 
    A=words(m).L; %list of LEFT neighbors with repetitions 
    A=deblank(A); 
    CS=A; 
    %%%%%%%%% 
    pigwork, %counts repetitions 
    %%%%%%%%% 
    A=CS; 
    wordsa(m).L=A; 
end 
for mm=1:(lw-1), 
    S=wordsa(mm).L; QL=size(S); sl=QL(2)-1; 
    if isempty(S), disp(' '), continue, end %word without recorded neighbors 
    SK=wordsa(mm).L(1); 
    for k=1:sl, 
        B=SK; D=wordsa(mm).L(k+1); SK=strcat(B,',',D); B=SK; 
    end 
    %The strrep arguments were lost in extraction; putting a blank after 
    %each comma is an assumption, not the confirmed original. 
    CAS=SK; CAS=strrep(CAS,',',', '); DDD=char(CAS); disp(DDD), 
end 
%%%%%%%%% 
disp(blanks(2)'), disp('WORDS-RIGHT'), disp(blanks(2)') 
for m=1:lw, 
    A=words(m).R; %list of RIGHT neighbors with repetitions 
    A=deblank(A); CS=A; 
    %%%%%%%%% 
    pigwork, %counts repetitions 
    %%%%%%%%% 
    A=CS; wordsa(m).R=A; 
end 
for mm=1:lw 
    S=wordsa(mm).R; QR=size(S); sr=QR(2)-1; 
    if isempty(S), disp(' '), continue, end %e.g. END has no right neighbor 
    SK=wordsa(mm).R(1); 
    for k=1:sr, B=SK; D=wordsa(mm).R(k+1); 
        SK=strcat(B,',',D); B=SK; 
    end 
    %Same assumption about the lost strrep arguments as above. 
    CAS=SK; CAS=strrep(CAS,',',', '); DDD=char(CAS); 
    disp(DDD) 
end 


pigwork.m 

%pigwork replaces the list CS by its unique entries, 
%prefixing each repeated entry with its count 
CS; CSS=unique(CS); lcs=length(CS); lcss=length(CSS); c=0; 
for s=1:lcss, %base word 
    c=0; %start count 
    for k=1:lcs %comparison word 
        if strcmp(CSS(s),CS(k))==1, 
            c=c+1; %count 
        end %of counting 
    end %of checking and counting of CSS(s) 
    if c>1, % if c==1, do not change CSS 
        cac=int2str(c); %converting into string 
        D=strcat(cac,'-',CSS(s)); CSS(s)=D; 
    end 
end 
CS=CSS; 

ms.m 

%mindset script: ms 
disp('REMINDER: start the input string with "START" and end with "END"'); 
tic; space=char(160); 

%PART 1 NAME; unique list UNAME of G for G.name 
A=cellstr(P); G=[]; Q=[]; LP=length(A); %number of words (rows of P) 
uA=unique(A); luA=length(uA); zer=[]; 
for m=1:luA, count=0; 
    for n=1:LP 
        if isequal(uA(m),A(n)), count=count+1; 
            if count>1, zer=cat(2,zer,n); end, %positions of repeated copies 
        end 
    end, 
end, %zer 
AL(1)=cellstr('START'); AL([2:LP])=A([1:(LP-1)]); %left neighbor of each position 
AR(LP)=cellstr('END'); AR([1:(LP-1)])=A([2:LP]); %right neighbor of each position 
UNAME=A; UNAME([zer])=[]; lg=length(UNAME); 

%PART 2: L&R 
all(1)=1; arr(1)=1; ALL=[]; ARR=[]; 
for k=1:lg, % NAME from ANAME IP 
    a=UNAME(k); al=[]; ar=[]; % first neighbors; correct ends later! 
    cnl=0; cnr=0; 
    for j=1:LP %search for neighbor copy 
        b=A(j); %br=AR(j); bl=AL(j); 
        if isequal(a,b)==1, 
            al=cat(2,al,j); ar=cat(2,ar,j); 
        end 
    end 
    NB(k).L=al; NB(k).R=ar; % L and R neighbor list 
end 

%PART 3 copy count NAME, L&R 
for s=1:k 
    G(s).name=UNAME(s); 
    zz=strmatch((UNAME(s)),A,'exact'); G(s).NQ=length(zz); G(s).AQ=zz; 
    %%%%%%%%% right neighbors 
    X=AR([NB(s).R]); Y=unique(X); lub=length(Y); 
    output=[]; numr=[]; RN=[]; 
    for qq=1:lub, 
        mtch=strmatch(Y(qq),X,'exact'); 
        lmtch=length(mtch); 
        w=Y(qq); 
        if lmtch>1, output=strcat(output,num2str(lmtch),'-',w(1),';',space); end 
        if lmtch==1, 
            output=strcat(output,w(1),';',space); 
        end 
        %output=strcat(space,output,num2str(lmtch),'-',w(1),';',space); 
        numr=cat(2,numr,lmtch); 
        wwr=cellstr(w(1)); 
        wu=strmatch(wwr,UNAME,'exact'); RN=cat(2,RN,wu(1)); 
    end, 
    G(s).nr=numr; 
    G(s).R=cellstr(output); 
    G(s).ReN=RN; 
    %%%%%%%%% left neighbors 
    X=AL([NB(s).L]); Y=unique(X); lub=length(Y); 
    output=[]; numl=[]; LN=[]; 
    for qq=1:lub, 
        mtch=strmatch(Y(qq),X,'exact'); 
        lmtch=length(mtch); 
        w=Y(qq); 
        if lmtch>1, 
            output=strcat(output,num2str(lmtch),'-',w(1),';',space); 
        end 
        if lmtch==1, 
            output=strcat(output,w(1),';',space); 
        end 
        numl=cat(2,numl,lmtch); 
        wwl=cellstr(w(1)); 
        wu=strmatch(wwl,UNAME,'exact'); LN=cat(2,LN,wu(1)); 
    end, 
    G(s).L=cellstr(output); 
    G(s).nl=numl; 
    G(s).LeN=LN; 
end % structure G.name/L/R 

G(1).L=cellstr('START'); G(lg).R=cellstr('END'); 

%%%disp('Type dsgw to display output for comprehensive 10 column table of G'); 
%%%disp('Type dsgn to display output for narrow 5 column table of G'); 
t=toc; 
tm=strcat('P = ', num2str(LP), ' G = ', num2str(lg), ' time', ':', space, num2str(t)) 
disp(tm); 
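
As a reading aid, here is a compact sketch of what ms.m appears to compute for the right-hand side: each distinct word, its number of occurrences, and a tally of the words that follow it. It is a simplified restatement under assumptions, not the author's script: it uses `unique(...,'stable')` and `accumarray` instead of `strmatch`, and the sample text and variable names are made up for illustration.

```matlab
% Sketch: unique words with occurrence counts and right-neighbor tallies.
% Variable names and the sample text are illustrative assumptions.
A = {'START','the','dog','sees','the','dog','END'};

[names, ~, idx] = unique(A, 'stable');   % names in order of first appearance
counts = accumarray(idx(:), 1)';         % occurrences of each name

for k = 1:numel(names)
    pos = find(idx(:)' == k);            % positions of this word in the text
    pos = pos(pos < numel(A));           % the last word has no right neighbor
    txt = sprintf('%s (%d):', names{k}, counts(k));
    if ~isempty(pos)
        [rn, ~, ri] = unique(A(pos + 1), 'stable');
        rc = accumarray(ri(:), 1)';      % how often each neighbor follows
        for q = 1:numel(rn)
            if rc(q) > 1                 % same "count-word" style as ms.m
                txt = [txt sprintf(' %d-%s;', rc(q), rn{q})];
            else
                txt = [txt sprintf(' %s;', rn{q})];
            end
        end
    end
    disp(txt)
end
```

The left-neighbor tally is obtained the same way with `A(pos - 1)` after dropping the first position instead of the last.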

cblr 

%cblr lists 
%tic; 
space=char(160); 
CB=[]; CCR=[]; CCL=[]; %structure CB stores bonds and cats: CB.bond, CB.cat 
g=1; gc=1; gl=1; gcl=1; 
for k=1:lg, 
    L=G(k).nl; M=G(k).NQ; R=G(k).nr; kgr=G(k).ReN; kgl=G(k).LeN; 
    %BONDS 
    lr=length(R); ll=length(L); 
    for j=1:lr, 
        if ((M>=R(j)) & (R(j)>=2)), CB(g).bond=[k kgr(j)]; g=g+1; 
        end, 
    end 
    %CATS (i.e., categories or classes) 
    if lr>1, CCR(gc).cat=k; CCR(gc).catr=[kgr]; gc=gc+1; end 
    if ll>1, CCL(gcl).cat=k; CCL(gcl).catl=[kgl]; gcl=gcl+1; end 
end 

%DISPLAY ADAPTED FOR TABLE FORMAT 
% copy to document, find and replace ^p^p for ^p, convert to table, 
% find and replace double spaces for single spaces, repeat until none found 
lcb=length(CB); lccr=length(CCR); lccl=length(CCL); 
disp(space) 
disp('BONDs'), %prepared for 2-column table; 
disp(space) 
%for c=1:lcb, disp(CB(c).bond), end 
%%%for c=1:lcb, disp(CB(c).bond), BB=strcat(G([CB(c).bond]).name); disp(char(BB)), end 
for c=1:lcb, disp(num2str(CB(c).bond)), BB=strcat(G([CB(c).bond(1)]).name,'+', ... 
        G([CB(c).bond(2)]).name); disp(char(BB)), end 
% for c=1:(length(CB)), disp(CB(c).bond), BB=strcat(G([CB(c).bond(1)]).name, ... 
%     G([CB(c).bond(2)]).name); disp(char(BB)), end 
disp(space) 
disp('RIGHT CATs'), %prepared for 4-column table; 
disp(space) 
for c=1:lccr, 
    %%%%disp(CCR(c).cat), 
    %%%disp(CCR(c).catr), 
    CTR=G(CCR(c).cat).name; 
    disp(char(CTR)), 
    DCTR=[]; lcatr=length(CCR(c).catr); 
    for cr=1:lcatr, CTR=CCR(c).catr(cr); 
        DCTR=strcat(DCTR,G(CTR).name,space); 
    end, 
    disp(char(DCTR)), 
end 
disp(space) 
disp('LEFT CATs'), %prepared for 4-column table; 
disp(space) 
for c=1:lccl, 
    CTL=G(CCL(c).cat).name; 
    %disp(char(CTL)) 
    CCTL=char(CTL); 
    DCTL=[]; lcatl=length(CCL(c).catl); 
    for cl=1:lcatl, CTL=CCL(c).catl(cl); 
        DCTL=strcat(DCTL,G(CTL).name,';',space); 
    end, 
    disp(char(DCTL)), disp(CCTL) 
end 
disp(space) 
%toc; t=toc; 
%CB, CCL, CCR, CTR, CTL 
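
The two criteria used in cblr can be restated in a few lines. As read from the listing above, a bond is recorded for a word and a right neighbor that follow each other at least twice, and a right category ("cat") is recorded for a word with more than one distinct right neighbor. The sketch below is a simplified paraphrase under that reading, with a made-up sample text and hypothetical variable names, not the author's code.

```matlab
% Sketch of the bond and right-category criteria on a toy text.
% Sample text and variable names are illustrative assumptions.
A = {'START','the','dog','sees','the','cat','and','the','dog','END'};

% BONDS: adjacent pairs that occur at least twice.
pairs = strcat(A(1:end-1), '+', A(2:end));      % every adjacent word pair
[upairs, ~, pidx] = unique(pairs, 'stable');
pcount = accumarray(pidx(:), 1)';               % occurrences of each pair
bonds  = upairs(pcount >= 2);
disp('BONDs'), disp(bonds)

% RIGHT CATs: words followed by more than one distinct word.
[names, ~, idx] = unique(A, 'stable');
disp('RIGHT CATs')
for k = 1:numel(names)
    pos = find(idx(:)' == k);
    pos = pos(pos < numel(A));                  % skip the final position
    if isempty(pos), continue, end
    rn = unique(A(pos + 1), 'stable');          % distinct right neighbors
    if numel(rn) > 1
        fprintf('%s -> %s\n', names{k}, strjoin(rn, ' '));
    end
end
```

In this toy text the pair the+dog is the only one that repeats, so it is the only bond, while "the" and "dog" each have two distinct right neighbors and therefore head a right category.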

dsgn.m 

%dsgn displays data for a narrow 5-column table; convert text to table with MS-Word 
for dp=1:lg, 
    disp(num2str(dp)), 
    disp(char(G(dp).L)), 
    disp(char(G(dp).name)), 
    disp(num2str(G(dp).NQ)), 
    disp(char(G(dp).R)) 
end 
%script dsg : displays G for a comprehensive 10-column table 

dsg.m 

%script dsg : display data for a comprehensive table (redundant for common use) 
%PART 4 : DISPLAY 


disp('The output can be pasted into MS Word document and converted ') 



disp(' into a comprehensive table of 10 columns ') 


disp('ATTENTION: Remove empty lines before conversion') 
disp( 'with: "find and replace A p A p for A p'") 

disp('COLUMN HEADS: 1. Generator number in G; 2.left neighbors; 3. left 
neighbor positions in G;') 

disp ('4. occurrences of left neighbors in input P; 5. generator name; 6. 
generator positions in input P;') 

disp ('7. right neighbor positions in input P; 8. occurrences of right neighbors 
input P'); 

disp ('9. positions of right neighbors in G; 10. right neighbors;') 


%if DT==1, 


for dp=1 :lg, 

%FOR COMREHENSIVE TABLE, 10 column 


%1 

disp(dp) %NAME ordinal number 
%2 


disp(char(G(dp).L)) 



%3 


d i s p (G (d p). L e N) % % % % % % % % % % % % % % % % % % % % % 
%4 

d i s p (G (d p). n I) % % % % % % % % % % % % % % % % % % 

%5 

disp (char(G(dp).name)) 

%6 

disp ( G(dp).NQ) 

%7 

disp ((G(dp).AQ')) 

%8 

d i s p (G (d p). n r) % % % % % % % % % % % % % % % 

%9 

d i s p (G (d p). R e N) % % % % % % % % % % % 

%10 

disp(char(G(dp).R)) 

end 

% for dp=1 :lg, disp(char (G(dp).name)), end; 


% for dp=1 :lg, disp(G(dp).QN), end; 



% for dp=1 :lg, disp(G(dp).AN), end; 

% for dp=1 :lg, disp(G(dp).nl), end; 

% for dp=1 :lg, disp(char (G(dp).L)), end; 
% for dp=1 :lg, disp(G(dp).LeN), end;

% for dp=1 :lg, disp(char (G(dp).R)), end; 
% for dp=1 :lg, disp(G(dp).ReN), end; 

% for dp=1 :lg, disp(G(dp).NQ), end; 

% for dp=1 :lg, disp((G(dp).AQ)'), end; 

% [1:lg]' %line numbers 
%end 












PATTERN CHEMISTRY OF THOUGHT AND SPEECH 

by Yuri Tarnopolsky

SUMMARY

The parallel between linguistics and chemistry has been drawing attention since the
discovery of the DNA’s structure and its ability to carry a protein "meaning." Besides,
chemistry uses a particular language (chemical nomenclature) to convert complex
non-linear structures into a linear word which can potentially be communicated through
speech.

The present e-paper continues an investigation of thought, language, and conversion of 
one into the other within the framework of Pattern Theory of Ulf Grenander. This 
mathematical theory studies structural complexity regardless of interdisciplinary 
boundaries. It reduces structure to a set of atomic entities selectively connected by bonds, 
thereby representing observable objects of widest variety, including language and thought, 
quite like a generalized chemistry. The unusual aspect of Pattern Theory is its metric,
which allows for distinguishing between more and less stable structures. Pattern
Chemistry focuses not so much on stable structures as on the fleeting transition states
between them, similarly to the way chemistry treats molecular transformations, making
a distinction between fast and slow transformations.

The central idea of Pattern Chemistry is that complexity in nature and society evolves
from simple states changing by simple steps.

Unlike speech, thought is not observable. The paper further explores a hypothetical 
protolanguage, called Nean, in which the simple elementary thought consisting of two 
connected entities directly translates into the simplest elementary phrase consisting of 
two words. Nean sounds like a repetitive random series of elementary doublets. Two 
doublets with a common element can be combined into linear triplets. The paper explores
the ability of this inherently both linear and primitive language to express more complex 
non-linear thoughts by means of the process of linearization. It appears that Nean, 
subjectively, is quite expressive. 

A computer simulation based on simple principles represents thinking and speech as a 
competition of alternative thought structures for a spot in consciousness and further 
generation of linear speech-ready expressions longer than elementary doublets. The story 
of Three Little Pigs serves as the substrate of the process. The potential of Nean for 
further complexification and grammaticalization is discussed. Nean is regarded as a 
point of divergence between thought and speech and origin of the variety of grammars. 





CONTENTS 


1. INTRODUCTION

2. THE NEAN LANGUAGE

3. LINEARIZATION

4. LISTENING

5. THINKING

6. SPEAKING

7. UNDERSTANDING

8. CONCLUSION

APPENDIX





Last updated: November, 2010 


1. INTRODUCTION 


This e-paper continues an exploration of Pattern Theory (Ulf Grenander) as generalized 
chemistry. My previous e-publications are listed in complexity, see APPENDIX 1.

They are duplicated on SCRIBD and can be found at some other sites on the Web, which is
constantly eroded by the flow of time. The authors referred to can also be found through
Web search.

As there was a common ancestor for humans and apes, there could have been one for 
thought and speech. How would it sound? What is the difference between thought and 
speech and, if there is any, how could they have diverged in evolution? 

Unfortunately, the sounds of ancient speech cannot be heard. As for thoughts, although it
is fortunate for all of us that we cannot “read” or “hear” them, it does not help us with
understanding what thought is. By thought we typically mean the content of a verbal
expression. Sometimes, however, we stumble when expressing our own thoughts.

Turning to the origin of language, we must disregard written language and refined educated
speech, without which Chomskyan analysis would make but a few short steps. For most
of history, even after the invention of writing, the vast majority of people were illiterate.

I assume that thought and speech are essentially different. Until we decode thoughts, we 
may not know the truth, but I point to the fact that two people speaking very different 
languages seem to think in the same way because they can communicate their thoughts in 
a third language. Did the thinking of an immigrant from Russia, like myself, change after
abandoning the native language environment and switching to English? After a few years
in America, I suddenly noticed that in my spontaneous thoughts people from my
Russian past, who did not know a word of any foreign language, were speaking English. I
am also certain that the thinking of a chemist or a fan of music can be mostly non-verbal. I
realize that I have no proof of either assumption because I cannot open my mind for the 
reader. But can I at least speak my mind? 

I have been interested in those questions for a very long time. The central idea took shape 
around 1980, which coincided with my discovery of strikingly “chemical” Pattern Theory. 

I was greatly influenced by (1) Ulf Grenander’s work on GOLEM (Patterns of Thought),
which I was lucky to watch closely, and (2) the development of the concept of transition 
state in organic chemistry, which coincided with my years as a graduate student. Some 
other major influences are: 








(3) Manfred Eigen’s concept of chemical evolution (since 1971); 

(4) The Heraclitean world of novelty (further explored by Ilya Prigogine), as opposed to 
the logically closed world of Aristotle; 

(5) Bourbaki’s exotic concept of the scale of sets (rarely, if ever, remembered);

(6) Ross Ashby’s ideas of stability and homeostasis (remembered, but not appreciated);

(7) René Thom’s idea of the most abstract patterns of change (first abandoned, recently
revived, especially by linguists).

My major factual sources are decades of life in both Soviet and American systems, 
linguistics of dissimilar languages, and the overall history of artificial intelligence and 
science of complexity, the infancy, promise, and missteps of which I observed live. 

Although I am putting forward the simple idea that complexity is in fact an evolving 
simplicity, to elaborate this thesis would require a lot of words. I focus here only on 
some preliminary results, omitting most of the preamble, which can be found in 
complexity. Occasionally I will embed some digressions, as for example:


SOME PRINCIPLES OF PATTERN CHEMISTRY 

1. Pattern chemistry is representation of the world in terms of elements and bonds. 
It is indifferent to the specificity of areas of knowledge, putting on equal basis 
molecules, thoughts, social and economic phenomena, and anything else having a 
structure. It lies within the much larger domain of Pattern Theory and uses it as a
conceptual platform.

2. Pattern chemistry borrows general ideas of Pattern Theory and combines them
with the chemical concept of transition state: stable configuration A changes into
stable configuration B through an unstable transition state [A, B]. The transition
state is irregular in the sense of Pattern Theory.

3. Pattern chemistry is a study of exystems—evolving complex systems. The 
chemical contribution to the problem of complexity is: complexity evolves from 
simplicity in a sequence of simple steps. If this statement evokes the idea of 
algorithm, there must be something to it, only without either the programmer or 
the computer. 

4. Pattern chemistry is not supposed to have a closed set of axioms. It places on 
the foreground the little explored phenomenon of novelty (see The New and the 
Different) . King Solomon was, apparently, the first neologist, of a negative type. 

Exystems are unique, large, naturally (i.e., slowly and spontaneously) evolving
systems open in both the thermodynamic and the logical sense. They change while we
speak about them, thereby challenging Aristotle and hailing Heraclitus.
Generalizing the paradoxical Greek idea discussed by Aristotle in Ethics
(“Nobody can be called happy until he dies”), the history of the exystem is its
only explanation, never complete until the exystem is out of existence, leaving
only the shells of its patterns.

The inherent openness of exystems to novelty requires a decisively open mind 
from a natural scientist because it pulls the rug of theoretical solidity from under 
the feet and a possible grant together with it. You cannot promise knowledge, 
only understanding. 

I distinguish between artificial Artificial Intelligence and natural Artificial
Intelligence. The former is (1) designed, like a machine, for a specific function, (2)
trained, tuned, and managed by the human mind and therefore (3) “infected” with
ready-made human knowledge. The natural AI is supposed to be launched by the human mind as a
primitive embryonic system with no specific function and then abandoned by its creator
to feed, grow, and seek fortune on its own. The important idea behind this distinction,
applicable also to Artificial Life, is that very simple systems can come into being by
accident, while complex systems never can. Simple acts do not need a creator.

NOTE: In the previous paragraph I did not intend an allusion to the Bible. I realize, however, 
that there is a pattern: the natural AI shakes off the blissful somnolence after the first bite of 
knowledge. 

Characteristically, the origin of life used to be a very troubling question for physicists
who considered life mathematically improbable (was Gödel’s proof ever probable?), but
it is only a technical problem for the chemists whose daily bread is stepwise synthesis of
new and never seen before complex molecules from simple ones.

If all this sounds rather grandiloquent, it may be justly disregarded. I am going to present 
here some synthesized thoughts, which, I believe, have some relation to natural ones. 

I have to forewarn that I think and see the world as a chemist, which may be very 
different from the way other people do. My computer “experiments” are neither true 
experiments, nor computer models, nor simulations in the area of thought and speech. To 
be exact, they are just illustrations of my own thinking. 





2. THE NEAN LANGUAGE 


What happens when a two- or three-year-old child listens to an illustrated book, “The Three
Little Pigs,” read by a parent?

There was an old sow with three little pigs and as she had not enough to keep 
them she sent them out to seek their fortune. The first that went off met a man 
with a bundle of straw and said to him, “Please, man, give me that straw to build 
a house.” Which the man did and the little pig built a house. Presently came 
along a wolf and knocked at the door and said, “Little pig, let me come in.” The 
pig answered, “No.” The wolf then answered to that, “Then I’ll puff and I’ll blow 
your house in.” So he puffed and he blew his house in and ate up the little pig. 


The child might have never seen a live sow, piglets, straw, or a wolf. Some words, such as
“poor” or “fortune,” could be just sounds of parental babble. While animals and things
are represented by pictures, some actions may be rendered only by words. Some of them,
like “blow,” may be familiar from blowing soap bubbles; others, like “send away,” may
not.

Let us further modify the situation by placing the child inside a cave where ancient
humans are right in the process of developing intelligence and mastering language. They
are telling stories to children and each other. How could their language sound in statu
nascendi? What trace does the story leave in the previously empty mind? What is the
difference between thought and speech? How could this difference emerge?

While approaching those questions a few years ago, I called the hypothetical primitive
language Nean. The name Nean honors the Neanderthals, but does not imply that they
were speaking it or speaking at all. I believe, however, that early humans did, as does a
two-year-old child, who “combines words into a short sentence - largely noun-verb
combinations” and only by the third year strings together at least three words.


CONCEPT OF RECAPITULATION 

Although the universality of the concept of recapitulation—repetition of 
evolutionary stages in individual development (“ontogeny repeats phylogeny”)— 
has been disputed, its general validity is widely recognized. In terms of pattern 
chemistry, recapitulation means that the pattern of complex ideation could be the
same in all exystems. In terms of embryology it may mean that the order of the
evolutionary path is partially recorded in the individual genome. We can say this 
about a pattern, not configuration. Individual human life in modern society is an 
example of exystem, too. 

Nean consists of single words (singlets), doublets, and, at most, triplets. Unlike modern
developed languages, which convert the often branching, non-linear, fuzzy, and
meandering thought into linear speech by way of convoluted grammar rules, Nean is
straightforwardly linear and, I suggest, allows us to literally hear the thoughts which do 
not depend on any modern or primitive language. Nean opens a window into thinking not 
obscured by the complexity of modern language, which overwhelms even some heads of 
states. It is not infected either with intelligence or with ignorance. Nean is the language of 
simplicity, not complexity. Its chemical flavor consists in a potential for unlimited 
complexity in further evolution. 

The closest “adult” approximations to Nean are pidgin languages and early stages of 
mastering a foreign language by adults. 

At the opposite end of the scale of complexity we find the language of philosophers and
most writers, as well as scientists, which would be impossible without writing and the ability
to keep long strings of words on paper or computer screen for an indefinite time and, all the
more, to edit them.


SIMPLICITY IN NATURE 

Simplicity means building complexity from a simple beginning through a 
sequence of very simple steps fitting a limited number of patterns. Related 
mathematical objects are recursive functions, complete mathematical induction, 
and, especially, fractals. The entire enormity of chemical structures, from many 
monomeric natural components of plants and animals to the monotonous 
complexity of biopolymers, consists of simple elements and can be generated by 
simple operations of synthesis, predominantly, within the environment of a single 
atom or two, and essentially reduced to breaking or locking up a few bonds. 

Organic chemists, like, probably, architects, strongly prefer visual perception
to verbal communication. As a chemist, I favor demonstration over explanation.
Figure 1 presents three substances found in plants. Chemists place them all in one
class: terpenes (from turpentine). Terpenes and their derivatives, terpenoids (e.g.,
camphor, menthol), form a wide, motley, and nice-smelling variety, see Gallery.
What can they have in common and how could plants be so chemically
sophisticated as to produce them?

The carbon skeletons of terpenes (and natural rubber) are built of the same
relatively simple isoprene unit: five carbon atoms with a fork at the end. They are
all related by chemical origin and properties and can undergo dramatic, dizzying
rearrangements of their skeletons which turn one terpene into another. Steroids,
produced by both plants and animals, have the same isoprenoid regularity of the
skeleton. A single simple principle explains the diversity and its origin in nature.
To borrow Richard Dawkins’ metaphor of the Blind Watchmaker, nature, having
once invented a single isoprene gear, found a way to all terpene and steroid
“clockworks,” some of which endow humans with the joy of sex.





[Figure 1 image not reproduced; the one surviving label reads “Phytol.”]

Figure 1: Simplicity behind complexity. Terpenes are a richly
diverse class of natural components of plants, all built of the
same isoprene unit


From the pattern-chemical point of view, both thought and language emerged from the once-invented
single connection in the nervous system: A—B . 

I want to draw attention to the essence of pattern perception of the world: it transcends 
the borders between disciplines of knowledge and different domains of reality. The 
molecular chemistry of terpenes and steroids is of no significance whatsoever for 
linguistics and vice versa. When, amazed by the richness and diversity of the world, we 
try to understand how it all is related and how it all was “created,” we need some 
vantage point from which it all can be seen. This point has always been in the realm of 
poetry as art and is what Pattern Theory offers as science. 










3. LINEARIZATION: FROM THOUGHT TO SPEECH 


Nean starts with the smallest nuclear particle of complexity: a bonded pair. 

I translate the pig story into the “classic” doublet (two-word phrase) dialect of Nean: 

Sow is old. Sow has pigs. Pigs are three. Their names are: First is pig. 
Second is pig. Third is pig. Sow is poor. Sow sents. Sents first. Sents 
away. Sents seek. Seek fortune. First meets. Meets man. Man has 
bundle. Bundle of straw. First says. Says to man. Man give. Give 
straw. Give to First. First builds. Builds house. Wolf comes. Comes to
house. Wolf asks. Asks First. Asks to enter. First says. Says no. Wolf 
blows. Blows house. Wolf eats. Eats First. 

I have added some words and endings in small print just to illustrate what Nean ignores,
but I will further do without them, as standard written Hebrew and Arabic do
without vowels.

Nean is understandable not from a podium but in a context of a concrete situation. Its 
only rule of grammar is that the doublet can be ordered, so that first ask and ask first 
may have different meanings in a particular local dialect of Nean. 


I advise considering words of Nean simply as symbols, and I would prefer pictograms.
Chinese characters for pig, man (male, “the one who toils in the field”), and fortune are
even better because they are mind-sterile for a non-speaker of Chinese. [The pictograms
and characters are not reproduced here.] Numbers are the best, but
only for computers talking to each other. I will use English words instead of numbers,
although my computer program operates with numbers assigned in a particular order,
which I will further describe.
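
The idea of a program that works with numbers while the reader sees words can be sketched in a few lines. This is only an illustration in Python, not the author's MATLAB code: the function name `number_words` is hypothetical, and numbering words in order of first appearance is an assumption, since the actual order is described elsewhere.

```python
def number_words(doublets):
    """Assign each word a number (assumed here: order of first appearance)."""
    codes = {}
    for doublet in doublets:
        for word in doublet:
            if word not in codes:
                codes[word] = len(codes) + 1  # numbering starts at 1
    return codes


# A few doublets from the pig story, with endings and helper words dropped.
story = [("sow", "old"), ("sow", "poor"), ("sow", "sent"), ("sent", "first")]
codes = number_words(story)

# The same story as pairs of numbers, the form a program would operate on.
encoded = [(codes[a], codes[b]) for a, b in story]
print(codes)
print(encoded)
```

The words remain only as labels for the human reader; everything downstream could work on the numeric pairs alone.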

The phrases of Nean are singlets and doublets (“classic, pure” Nean), to which the “Late”
decadent Nean adds triplets. The triplets are formed by a transformation rule which is a
kind of “pattern-chemical” reaction:

wolf—eat + eat—first -> wolf—eat—eat— first -> wolf—eat— first 
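
The reaction above can be written as a small function. This is a minimal sketch in Python, not the author's program; the name `join_doublets` is hypothetical. It assumes that two doublets react only when the last word of the first equals the first word of the second.

```python
def join_doublets(left, right):
    """Join two ordered doublets into a triplet through their common word.

    Returns None when the last word of `left` is not the first word of `right`.
    """
    if left[1] != right[0]:
        return None
    chain = list(left) + list(right)      # wolf, eat, eat, first
    return tuple(chain[:2] + chain[3:])   # drop the doubled middle word


triplet = join_doublets(("wolf", "eat"), ("eat", "first"))
print("-".join(triplet))

# Doublets that share only their first word do not react under this rule.
print(join_doublets(("sow", "poor"), ("sow", "sent")))
```

The intermediate list `chain` plays the part of the four-word intermediate in the reaction; the final step removes the duplicate.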




Triplets and doublets can be further polymerized into longer phrases, but what for with 
no time on a hunt or too much of it in a cave? If you get used to Nean, it sounds 
remarkably expressive and quite capable of nurturing its cave Hemingway. 


COMPLEXITY OF LANGUAGES 

In the post-Nean era, agglutinative languages (like Latin and Russian) acquired a
freedom of word order, which they much later used for expressing fine, even
redundant, nuances of speech. Probably, all languages at an early stage were
synthetic, richly inflected and complicated, as opposed to analytic ones. Old
English was more complex and had a looser word order than modern English,
Old Church Slavonic was even more complex than Russian, and Sanskrit was
much more intricate than its posterity.

Why the complexity of language could, supposedly, evolve through explosive
growth and then subside is an interesting question. I believe the answer might be
found in the hypothetical area of linguistic thermodynamics, which would
approach language evolution as any natural process (not a completely new idea
after all the principles of economy, least effort, etc.). General laws of nature are the
only reliable guides into the extinct past.

We access extinct languages through their written fossils. Writing is a natural
jungle-like habitat for complexity. Tribal polysynthetic languages, like Navajo,
which beats all records of complexity, typically developed in relatively stable,
insulated, or widely dispersed types of environment (large plains, mountains,
islands) in tribal societies with little mixing. A writer is also a loner, as compared
with the buyer-seller pair at the marketplace. We can only speculate about the
evolution of linguistic complexity, however. Some new insights into this problem
could come from studying the relative complexity of language, i.e., the ratio of
complexity of language to socio-economic complexity, taking into account the
socio-economic temperature. We might then find enough simplicity in Navajo and
the Inuit dialects.

Of course, I express my opinion not as a linguist, but as a pattern chemist. The
keystone principle of pattern chemistry, formulated only in the 1970s (Ilya Prigogine),
is: the processes in living nature, human history included, are radically different
from the processes in inanimate nature. The exystems (organisms, societies) stay
far from equilibrium, while natural inanimate systems are either as close to it as
possible under the circumstances or slowly moving toward it. Besides, ethnic mixing,
the pattern counterpart of physical heat and chemical stirring, destroys stagnant
communities, and any melting can only simplify a system.

Probably, the principle of stringing words together, invented with Nean, turned
out to be too successful: everything on hand was glued together. Here is an example
that fascinates me, see Table 1.





Among blue objects, in Russian, flag is masculine: its English equivalent of the
personal pronoun is he. Sea and sky are neuter, it, and cup is a she.

Table 1: Parallelism of personal pronouns and adjectives in the Russian
language.

English        Russian: declension of “he” and “blue flag”

he             on (he)      siny flag (blue flag)    он         синий флаг
to him         yemu         sinemu flagu             ему        синему флагу
about him      o nem        o sinem flage            о нем      о синем флаге
with him       s nim        s sinim flagom           с ним      с синим флагом
without him    bez nego     bez sinego flaga         без него   без синего флага


Why in the world should the endings of adjectives copy the endings of personal 
pronouns? I see it, half-seriously, but not half-jokingly, as a relic of Nean: 

English: I go to blue sea

Classic Nean: I-go go-to to-it it-to to-it it-blue blue-sea sea-to it-blue blue-to
to-it.

For some reason, the spoken language tends to eliminate the doubling of syllables in
words. This phenomenon is called haplology (“saying a half of it”), which the term itself
illustrates: haplology. Haplology of “haplology” would produce “haplogy,” but writing
preserves the longer form of an exotic word. If “haplology” were a common fruit, it
would collapse into “haplogy.” Probably, a longer word takes more time to say and this is
why a compressed version wins the competition with the longer one: it is, in chemical
parlance, more stable, or, in physical parlance, has lower energy, and, in
pattern-theoretical parlance, is more probable.

Evolution of language, as any natural evolution, is a never-ending walk to resting places
of stability through the rough and uncertain terrain of transition states. This, of course, is
only a hypothesis and we have to wait for somebody to develop thermodynamics of
spoken language to see if it makes sense. It certainly makes as much sense to say
wolf-eat-first instead of wolf-eat-eat-first as to say laundromat instead of laundry
automat even if you are not in a hurry.


REDUPLICATION 

Accidental doubling of syllables in haplology does not carry any semantic or
grammatical load. On the contrary, reduplication, which on the surface is the
same, carries a function. It can be an expression of plural, as in Indonesian
(orang-orang: people), or of some special degree of quality, as in emphatic
colloquial Russian (bely, white, bely-bely, perfectly, amazingly white, idet, walks,
idet-idet, walks for a long time, especially in folklore) or standard Thai (uan, fat,
uan uan, very fat). So, regarding the pig story, eat-eat could be retained for
expressing either speed of eating or its shocking cruelty.

Nean is the language of thought in the sense that it expresses the thought directly by an 
ordered linear string of words. More exactly, Nean preserves the topology of thoughts, so 
that the speaker literally speaks his mind. There is, however, an important and instructive 
chemical complication when we attempt to linearize doublets into triplets. 

How can we combine, compress, and line up the thought represented by the following 
pairs of doublets or even all four of them? 

sow poor , sow sent or sent away , sent seek 

Between the initial state of the thought consisting of two doublets and the final linear 
triplet of speech lies the nonlinear—because of the word order—transition state. 


INITIAL STATE:      sow poor  +  sow sent

                          poor
                         /
TRANSITION STATE:   sow
                         \
                          sent

FINAL STATE:        ? - ? - ?


We cannot linearize it by the rules without either reversing the word order (poor sow 
sent or sent sow poor) or breaking up one of the doublets (sow poor sent or sow sent 
poor). 
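
The claim that no ordering satisfies the rules can be checked by brute force. The sketch below is in Python and is only an illustration; the function name `violations` is hypothetical. It assumes two rules: the words of a doublet keep their order, and they stay next to each other.

```python
from itertools import permutations


def violations(order, doublets):
    """List the doublets that a linear word order reverses or breaks up."""
    position = {word: i for i, word in enumerate(order)}
    problems = []
    for first, second in doublets:
        if position[first] > position[second]:
            problems.append(f"{first} {second}: order reversed")
        elif position[second] - position[first] != 1:
            problems.append(f"{first} {second}: doublet broken up")
    return problems


doublets = [("sow", "poor"), ("sow", "sent")]
words = ["sow", "poor", "sent"]

# Try all six orderings of the three words.
for order in permutations(words):
    print(" ".join(order), "->", violations(order, doublets) or "ok")
```

Every one of the six orderings reports at least one problem, while a chain such as wolf eat first, checked against wolf eat and eat first, reports none.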

This is exactly the point at which the evolutionary ways of thought and speech must 
diverge. 

Thought is a state of mind, a mixed bag (not the “bag” of set theory) of singlets and doublets
(a can of worms is a better metaphor). There are different ways of linearization and
different languages deal with the problem differently: sow poor sent can be standard in
French (La truie pauvre a envoyé le cochon), and poor sent sow is not good, but still
possible in Russian, something like: “Disappointed, went she home.” Isn’t it possible to
say in English, “School is almost done! Oh, so happy away we go to college”? We are so
excited (the abstract temperature of thinking is high) that we do not care about grammar
anymore: the rules melt down. Neither a natural process nor an artificial one is possible
without change in energy or entropy or sensitivity to temperature.





Regarding the mind, we deal there with configurations and patterns which cannot be
poured from a bottle and put into a flask, like chemicals, in zillions of tiny copies. We
cannot talk about symbolic transformations, which I call “thinking,” in exactly the same
manner as about molecular transformations. They are not dynamic systems in the physical
sense. Events in exystems are Heraclitean: singular and happening only once through the
timeline. Ilya Prigogine formulated this property, controversially, but neatly, as
indeterminism, not in the quantum-mechanical sense, of course.

Our pattern-chemical explanations by necessity are rather metaphoric and not rigorous,
but to exactly what degree is an important separate topic which I am not yet prepared to
approach systematically.


ENTROPY AND INEQUALITY 

To give some hints, we cannot chase probabilities in unique and singular acts,
with no actual statistical observations. Entropy without probabilities is hot air,
even though initially it was a macroscopic thermodynamic parameter of hot air.
We ask what entropy is and hear “uncertainty.” We ask what uncertainty is and
hear “entropy.” This is why I, under an approving look of Aristotle, attempted to
purge entropy from pattern-chemical language in Introduction to Pattern
Chemistry, substituting inequality of distribution for it. For our purpose, the Gini
coefficient is fine, even though it does not bring a dime. It does not take it away either.
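
For readers who have not met the coefficient, here is a minimal sketch of how inequality of a distribution can be measured without probabilities. It is written in Python and uses the standard sorted-values formula G = 2·Σ(i·x_i)/(n·Σx_i) − (n+1)/n. What is being distributed (for example, weights of bonds or counts of words) is an assumption of this example; the text does not say.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 is perfect equality."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Weight each value by its rank (1-based) in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini([1, 1, 1, 1]))   # everything shared equally
print(gini([0, 0, 0, 1]))   # one item holds everything
```

With equal shares the coefficient is 0; when one of n items holds everything it reaches (n − 1)/n, the maximum for a sample of that size.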

In the language of “simple reasons,” the non-linear transition state is a higher barrier to
speech simply because it takes time to make a choice. While the hunter is thinking how
to warn his partner about a saber-tooth tiger, the unthinking tiger will come to an instinctive
decision much faster. Politicians, expected to be tigers, usually pay small political cash for “dithering.”

The direction of the post-Nean evolution of language splits: one route goes toward
long-distance fixed word order or affixes (Velcro fasteners between words), the other leads to
the fixed word order and shorter sentences to preserve it. In pattern-chemical terms, both
offer catalysts for linearization of thought. Grammar is a typical catalyst: it can be used
again and again without losing its properties, while slowly evolving further, most
probably, toward simplification, until literacy brings its evolution to a halt (and texting
pushes it further).

However simple it is, Nean has an obvious potential for increasing complexity.


Simple thoughts: first meet, meet man, man bundle, bundle straw.
Composite thought: first meet man bundle straw.

Synthesis of modern speech as linear composition:
first meet-s a man with + man with a bundle of (something) + bundle of straw
-> first meets a man with a bundle of straw.

“First meet man bundle straw” is still Nean, but on the way to complexity. The
original simple doublet Nean version looks understandable enough, but I am afraid to
infect my model with human predilections. With all the precautions regarding sterility,
objectively, I can see that the chain of four doublets “first meet, meet man, man
bundle, bundle straw” is longer than the single quartet “first meet man bundle
straw,” but the linearization is straightforward: the speaker of Nean has only to speak
his mind to be understood in context.
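
The straightforward case, where each doublet starts with the word that ended the previous one, can be sketched as follows. This is an illustration in Python rather than the author's program; the name `chain_doublets` is hypothetical, and the function deliberately refuses any input that is not already a simple chain.

```python
def chain_doublets(doublets):
    """Compose a linear phrase from doublets that overlap end to start."""
    phrase = list(doublets[0])
    for first, second in doublets[1:]:
        if phrase[-1] != first:
            raise ValueError(f"'{first} {second}' does not continue the chain")
        phrase.append(second)   # the shared word is spoken only once
    return phrase


simple_thoughts = [("first", "meet"), ("meet", "man"),
                   ("man", "bundle"), ("bundle", "straw")]
composite = chain_doublets(simple_thoughts)

print(" ".join(composite))
print(2 * len(simple_thoughts), "words as doublets,", len(composite), "words composed")
```

The composed phrase says the same thing in five words instead of eight. Feeding the function sow poor followed by sow sent raises an error, which is the case the next paragraphs turn to.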

It is different with the poor sow. The synthesis (“de-duplication”) produces a non-linear 
thought, Figure 2. 

After the de-duplication, the non-linear configuration still must be linearized some way. 
This is how diverse grammars emerge and those which ensure more stability at the 
marketplace win. 
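
De-duplication can be pictured as pouring doublets into a structure that keeps each word only once and remembers what follows it. The sketch below is in Python and is not the author's simulation; the names `build_thought` and `branch_points` are hypothetical. A word followed by more than one other word marks a place where the thought stops being linear.

```python
def build_thought(doublets):
    """Merge doublets into a graph: each word maps to the words that follow it."""
    graph = {}
    for first, second in doublets:
        followers = graph.setdefault(first, [])
        if second not in followers:      # repeated doublets add nothing new
            followers.append(second)
        graph.setdefault(second, [])     # make sure end words appear too
    return graph


def branch_points(graph):
    """Words with more than one follower: where linearity is lost."""
    return [word for word, followers in graph.items() if len(followers) > 1]


heard = [("sow", "poor"), ("sow", "sent"), ("sow", "sent"), ("sent", "first"),
         ("sent", "away"), ("sent", "seek"), ("seek", "fortune")]
thought = build_thought(heard)

for word, followers in thought.items():
    print(word, "->", followers)
print("branch points:", branch_points(thought))
```

For the sow doublets the branch points are sow and sent, so any spoken version has to choose which branch to follow first. That choice is what a grammar would settle.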

My intent, however, is to stick to Nean and try to extract as much linearity from it as
possible. As a chemist, I want to fill up a beaker with words and see what kind of chemical
reaction can happen between them and whether I can get any composite thought from
the mixture without the germs and enzymes of my own language and intelligence. The
computer illustrations, therefore, consist of three parts: LISTENING (filling up the
beaker), THINKING (mixing and linearizing), and SPEAKING (generating linear output).

The graphic output will even look like a beaker.

sow poor + sow sent +
sow sent + sent first +
sent away + sent seek +
seek fortune  ->

poor
 |
sow-sent-away
     | \
     |  first
    seek-fortune

Figure 2: Synthesis of non-linear thought

I do not know for sure whether what I call listening, thinking, and speaking has anything
to do with the human functions of the same name, but I believe that there is a way to find
out.









I do not think that language is absolutely necessary for thought. I want to give a personal 
account regarding non-verbal thinking of a chemist and I start with a simple illustration. 


Chemistry is a headache for most normal people. This is why I take aspirin as an example. 
When I think about it, I imagine its chemical structure. It is shown in the figure. 

Nevertheless, I do not mentally see the whole picture, although I can draw it. I see the
three major components of the molecule (benzene ring, carboxyl, acetylated phenolic
hydroxyl) and the way they are arranged around the ring. I can say what is in the
formula: 2-(acetyloxy)benzoic acid. It sounds like “two-acetyloxy-benzoic-acid,” which
is, actually, a phrase composed with a standardized dictionary and grammar
developed by chemists for communication between themselves. Any organic chemist
will understand me, although probably surprised that I do not say simply “aspirin.”
Frequently used chemical words are all shortcuts and not chemically correct names.

Figure: Aspirin



I see the process of verbally “expressing” the structure of aspirin as a linearization of
non-verbal thought. Moreover, when I write the structure, I perform a dimensionality
reduction: I translate the actual 3D structure, not shown here, into a 2D picture. I cannot
give an exact account of what is happening in my head when I think about aspirin (if not I,
then, for sure, nobody can), but I certainly feel it as an assembly of some fragments and
features of structure, whether verbal or visual.


LINEARIZATION IN CHEMSPEAK 

Figure 3 presents a more complicated example. Atorvastatin Calcium (Lipitor)
has the chemical name (3R,5R)-7-[2-(4-fluorophenyl)-3-phenyl-4-
(phenylcarbamoyl)-5-propan-2-ylpyrrol-1-yl]-3,5-dihydroxyheptanoic acid,
calcium salt.



Figure 3: Brain surgery of Atorvastatin Calcium 













The name is not even a word but a whole detective story. The tongue-twisting
language makes the name difficult to pronounce, but a chemist understands the
picture faster than you can say “Jack Robinson.” I am not sure, however, that
more than a handful of chemists can say it if woken up at 3 AM.

When I think of the structure, not just stare at it, I have in my head a soup of
fragmentary thoughts in no particular order: “fully substituted pyrrole in the
center,” “4-fluorophenyl,” “phenyl,” “isopropyl,” “ω-dihydroxyheptanoic acid
(not a correct chemical expression) at pyrrole’s N,” and other finer details, mostly
non-verbal, including the order of substituents. I show by color the correspondence
between some fragments of the name (morphemes) and the elements of structure
which they designate.

Anybody can now find 5R in the formula. Another test for a non-chemist: find
phenyl. Most probably, you will find yourself thinking non-verbally, except for the
word phenyl sounding in your head (perhaps not even sounding).

It is the grammar imposed by the International Union of Pure and Applied
Chemistry which orders my thoughts about the structure and guides me through
packing them into the long chemical name, if I need to do that. I have to look at
the structure for that because it is too big for my visual memory. As for the 
fluorine (F), I feel an instinctive aversion to fluorine in drugs, water, and 
toothpaste. I refuse to take Lipitor and prefer a totally different Simvastatin. This 
is how my thoughts, either visual or verbal or irrational, guide my behavior. The 
structure of Simvastatin, by the way, invokes memories of my very first 
postgraduate work on one of its fragments (lactone ring) and starts a new chain of 
thoughts far away in time and place from statins. 


Spoken language is not the only way of expressing a thought. Gesture, dance, ceremony, 
and picture are non-linear ways of communication, too, although some only one-way. 
Sound is not as restricted by environment as vision and speech and, unlike gestures, it 
does not impede work and movement. Writing definitely followed pictures until the 
divergence between letters, pictograms, and ideograms, which survived in Chinese and 
flourish in public signs. As to art installations—I am speechless and so should be you. 

I am repeating these trivialities to put forward the non-linear origin of thinking. Thought 
could be much older than language and not limited to humans at all. What has been 
much less discussed is that the linear language ensures communication between humans 
and things and brings it up to speeds far exceeding natural human possibilities. 

1927 

When thought lags behind speech, electronic economy 
collapses. Do we remember that the Great Depression 
happened in the era of telegraph, telephone, and ticker tape? 







4. LISTENING 


The following text is ready for entering as character strings: 

‘sow old’ ‘sow pig’ ‘pig three’ ‘first pig’ ‘second pig’ ‘third pig’ ‘sow poor’ ‘sow sent’
‘sent first’ ‘sent away’ ‘sent seek’ ‘seek fortune’ ‘first meet’ ‘meet man’ ‘man bundle’
‘bundle straw’ ‘first say’ ‘say man’ ‘man give’ ‘give straw’ ‘give first’ ‘first build’
‘build house’ ‘wolf come’ ‘come house’ ‘wolf ask’ ‘ask first’ ‘ask enter’ ‘first say’
‘say no’ ‘wolf blow’ ‘blow house’ ‘wolf eat’ ‘eat first’

There are 33 doublets in the text. I allow ‘first say’ to occur twice, just to see what
happens, but it is counted, inconsistently, only once in the array of words NAMES. What
happens will be seen later, but I did not know it in advance.

There are 28 elementary words (singlets): 


‘!’ ‘sow’ ‘old’ ‘pig’ ‘three’ ‘first’ ‘second’ ‘third’ ‘poor’ ‘sent’ ‘away’ ‘seek’ ‘fortune’
‘meet’ ‘man’ ‘bundle’ ‘straw’ ‘say’ ‘give’ ‘build’ ‘house’ ‘wolf’ ‘come’ ‘ask’ ‘enter’
‘say’ ‘no’ ‘blow’ ‘eat’.


I call the simple algorithm of my “LISTENING” program “SCALE.” The name SCALE
honors Bourbaki’s “scale of sets.” SCALE adds combinations of set elements as new
elements, remembers their composition, but, more importantly, SCALE has a primitive
recognition function: it distinguishes between new and old. Its workspace WORLD
(square matrix WW) initially contains only the “tohu-ve-bohu” (formless and empty)
element ‘!’. The total number of elements (words) is, therefore, 62. Programs SCALE
and, for “THINKING,” PROTO were first described in Molecules and Thoughts (2003).


In the subsequent illustrations, my input is in square brackets after “TYPE”. The rest is
the output of the program. I initiate SCALE by typing sc. Answering 1 (start) initializes the
workspace, but there is an option to continue filling up the previous one.

WORLD (WW) is a 62 x 62 connectivity matrix filled up in the order of the appearance 
of words in the text. The words are stored in NAMES. 






» [TYPE sc] sc 

1 to start, 2 to continue [TYPE 1] 1 

enter number of cycles (nn): [TYPE 2] 2 

ATTENTION: Interspaced components should be entered 
between apostrophes as single string of characters 

press any key to continue [ENTER] 

I start with the basic singlets and then load the doublets in the order of their appearance in 
the text. Command link processes a row of matrix WW and lists all words with the 
same element. Here is an example from the middle of the process: 

enter components [TYPE 'sow old'] 'sow old' [ENTER]

This is new. Name it [TYPE 'sow old'] 'sow old' [ENTER]


enter components [TYPE 'sow pig'] 'sow pig' [ENTER]

This is new. Name it [TYPE 'sow pig'] 'sow pig' [ENTER]


»link

NAME to check
[TYPE] 'sow pig'
sow old 

»[TYPE] NAMES 
NAMES = 

! 

sow old 
sow pig 

Note that the structure of WORLD retains the order of its genesis. I hint here at the idea
of recapitulation, but I am still far from exploiting it.

SCALE detects the components of doublets, accepts and assigns a name to a new
combination or classifies it as old. If the order of components in a possibly new doublet
is different from what is in WORLD, SCALE remarks “Looks like W,” gives the
coordinate of word W in WORLD, and asks: ‘Is it new? 1/0?’. SCALE, therefore, may
reflect or disregard the order of components, which increases its universality.
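
The recognition step described above can be illustrated with a minimal Python sketch. It is not the author's MATLAB script: the function name listen and the flag reordered_is_new are hypothetical, and the flag merely stands in for the operator's 1/0 answer to “Is it new?”.

```python
def listen(names, entry, reordered_is_new=True):
    """Classify an entered string against NAMES and store it if it is new.

    names            -- list of known words; names[0] is the empty word '!'
    entry            -- a singlet ('sow') or a doublet ('sow old')
    reordered_is_new -- hypothetical stand-in for the operator's answer when
                        the same components are already known in another order
    """
    if entry in names:
        return "old"
    parts = sorted(entry.split())
    lookalikes = [w for w in names if sorted(w.split()) == parts]
    if lookalikes and not reordered_is_new:
        # Same components, different order, and the operator said "not new".
        return "looks like " + lookalikes[0]
    names.append(entry)          # acquisition order is preserved
    return "new"


names = ["!"]
for item in ["sow", "old", "sow old", "sow old"]:
    print(item, "->", listen(names, item))
```

Because entries are only ever appended, the list keeps the order of acquisition, which is the property the text relies on when it speaks of the genesis of WORLD.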


If you want to check WORLD for a name, type: link
To see NAMES, type: NAMES
To display the 3D world, type: plotw


»link

NAME to check
[TYPE 'sow'] 'sow'
sow old
sow pig

»link

NAME to check
[TYPE 'old'] 'old'


»link

NAME to check
[TYPE 'pig'] 'pig'

»link

NAME to check
[TYPE 'sow old'] 'sow old'

sow pig





In short, words, from singlets to doublets, are connected in WORLD if they share a
component. Examples:

sow - sow old
sow sent - sent away
sow sent - sent seek
first build - first ask
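
The rule “connected if they share a component” can be sketched in a few lines of Python. This is only an illustration, not the author's code: build_world is a hypothetical helper, link merely imitates the command of the same name, and the matrix is assumed to be symmetric, whereas the original WW may record the direction of a bond.

```python
def build_world(names):
    """Return a 0/1 matrix ww with ww[i][j] = 1 when names i and j share a word."""
    parts = [set(name.split()) for name in names]
    n = len(names)
    ww = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):                 # compare each word with earlier ones
            if parts[i] & parts[j]:
                ww[i][j] = ww[j][i] = 1    # symmetry is an assumption
    return ww


def link(names, ww, name):
    """List every stored word bonded to the given one, in order of acquisition."""
    row = ww[names.index(name)]
    return [names[j] for j, bonded in enumerate(row) if bonded]


names = ["!", "sow", "old", "pig", "sent", "away",
         "sow old", "sow pig", "sow sent", "sent away"]
ww = build_world(names)
print(link(names, ww, "sow"))
print(link(names, ww, "sow sent"))
```

The empty word ‘!’ shares nothing with any other entry, so its row stays empty.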

For the purpose of visualization, all 62 words are arranged in a circle. A sequence of 
stages in building WORLD can be portrayed in 3D. 


CRITERION OF INTELLIGENCE 

One may note that I (or another operator, educator, as I would say) take part in
building WORLD and in this way infect it with human knowledge. For example, 
the operator decides whether the order of components is relevant. This is true, but 
note that it is done in the manner of a dialogue. Whether education was right or 
wrong, good or bad, can be ultimately decided only in a socio-economic and 
intimate setting, not by the educator. AI, I am sure, can create a good model for 
dialogue, which is the ultimate criterion in much disputed Turing test, but EVE—I 
decide for the first time to call “it” by name—is born in a dialogue. I see a test of 
intelligence as the ability to achieve prolonged stable states inside an exystem, 
not the ability to be taken for a human on the basis of email communication, all 
the more, texting. From this point of view, animals are intelligent, which is 
exactly my point. I do not measure intelligence, but simply look for its presence. 
Probably, it has already been suggested, but I have not found anything except 
Ross Ashby’s homeostasis. Is humankind intelligent? We need more time to 
think about it. (Answer: there is no such thing as humankind). 


Figure 4 shows how SCALE creates the bond between sow old and sow pig because
of the common component sow.

After the entire text is acquired, its internal representation looks as shown in Figures 5, 6,
and 8. Figure 7 shows a (25:35) fragment of WORLD on both sides of the borderline
between the first 28 singlets and the following (29:62) doublets. The singlets do not have
any bonds until doublets containing them are entered: the bonds run from the later
doublets back to the singlets, not toward the doublets. WORLD takes heed of the arrow
of time. It can look back, but not ahead.

New singlets can be acquired at any time, of course, and I placed them all in the 
beginning to imply that the listener to the story is already familiar with all words. This is 
not a necessary condition, however. 







Figure 5: 2D view of WORLD.

























































Figure 6: 3D view of WORLD ordered 
along the timeline of acquisition (axis Z) 




Figure 7. Fragment WW(25:35) in 
process of buildup and the border 
(red) between singlets and doublets 



Figure 8: Imitation of stereoscopic view of WORLD 

The READING stage can be extended indefinitely by adding new entries of WORLD on
top of the old ones. Note that during READING, bonds go not from past to present,
but from present to past. Therefore, the deeper and older areas of WORLD are being
enriched by new bonds induced by new acquisitions.


































Building of WORLD does not require operator’s intelligence (it can be automated) and 
does not serve any utilitarian behavioral function typical for AI. It is not difficult to 
imagine further steps toward realism, however. Thus, in principle, WORLD can—and 
should—react to any external stimulus by activating SCALE. We can also impose 
gradual fading of old areas, if desired. Next, we make WORLD fractal in an irregular 
manner. 

I am trying to design not a mind, however, but a possible launching pad toward its 
emergence. All I expect from the intellectual newborn is to start feeding on new 
experience and digesting the foodstuff. It is not learning in the sense the dogs and 
traditional AI systems are learning to bring you a pair of slippers. The only function of 
the model is to grow and think. 

The model (or, rather, the seed of a model) outlined here needs a name. I confirm the 
name EVE because Eve was the first to taste the apple and she liked the taste. 





5. THINKING 


The next stage is the pattern-chemical process of “thinking.” It consists in generating a 
sequence of winners in a competition of singlets and doublets for, metaphorically 
speaking, ascent into “consciousness” in which, in our experiment, only one thought can 
be a star. 

Consciousness is a serious business and I do not want any extra controversy. I call the 
selected or appointed thought acton and give it the star status marked by asterisk. 

THINKING starts by designating the initial thought in consciousness, although it can be 
chosen at random. All next actons are selected along the probability distribution 
calculated before selection. The core of THINKING is a modified procedure PROTO , 
script p3d3. There are several versions of the unit select which calculates the probability 
distribution. 

The selection of acton is based on a modified idea from the pioneering work of Manfred
Eigen, initiated in the 1970s and still progressing. It would take a lot of space to explain
the reasons for my choice of computation for the stage of thinking. Darwinism and neural
realism are two of them. I can afford only a short digression.


MANFRED EIGEN AND MOLECULAR EVOLUTION 

To say the least, Manfred Eigen’s ideas, inspired by Darwin, but essentially
chemical, are so general that they have put roots in areas as different as origin of
life and origin of language (Martin Nowak). Moreover, the readers (and authors)
of the avalanche of literature triggered by the Ising model of ferromagnetism
belong under the same conceptual umbrella, probably unaware of that, because
the equations are, in principle, of the same type, or better to say, pattern, as
Manfred Eigen acknowledged.

REFERENCES: 

Leuthäusser, I. (1986), An exact correspondence between Eigen’s evolution model and a
two-dimensional Ising system, J. Chem. Phys., 84, 1884-1885.

Eigen, Manfred (1986), The physics of molecular evolution; In: Molecular evolution of life;
Proceedings of the Conference, Lidingö, Sweden, 8-12 Sept. 1985, pp. 13-26.





The core of the idea is that the state of the topological neighbors (in not
necessarily Euclidean space) influences, but does not predetermine, the state of a
node of a network.

The graph of the network is a subgraph of a full graph on all nodes (as any graph 
is). This is why I arrange the nodes of the network in a circle: each element can 
potentially connect with any other. A non-mathematician, I intuitively anticipate a 
specific kind of semi-regularity in the graphs which I draw here. They are not 
exactly point lattices, but the limits on the valency (degree) of vertices (arity of 
generators in Pattern Theory) prevent combinatorial explosions. This is the 
essence of thinking: stay focused. I would say, vaguely, that exystems are severely 
restricted subsets of the scale of sets. 

Unlike the models of interactions on lattices, Eigen’s original chemical model
does not come to equilibrium. It is driven by non-stop replication, which in
chemistry requires thermodynamic openness. At the same time it is cruelly
conservative: the number of atoms is constant, although this can be modified.

Eigen’s initial model of chemical evolution does not contain a single chemical 
symbol and remains, I believe, an excellent example of what I call pattern 
chemistry. The concentrations, used in chemical equations, are nothing but 
probabilities to find a species in a unit volume. 


The only computational unit of PROTO, selection, calculates the probability
distribution at time t+1 as:

$$p_i^{t+1} = p_i^t + \sum_j G_{i,j}\, p_j^t - F\, p_i^t ,$$

where F defines forgetting or dissipation and $\sum_j G_{i,j}\, p_j^t$ is a sum of influences
on cell i over all neighbors j in the network. Parameter G can be different for different i, j.


There are additional rules interpretable as “the ping-pong player cannot send two balls in 
a row, but only in turn with the opponent.” 


1. The acton A_t selected at time t cannot be selected at time t+1.

2. The acton A_t selected at time t “remembers” its probability at time t+2 as C,
although it cannot be selected again until time t+2.

3. At time t+1, the neighbors perceive the probability of acton A_t as H·C.


All constants, F, C, G, and H, are ≤ 1, although it is not necessary. In principle, instead
of $p_i^t$, I could (and even should, following Eigen) use $A\,p_i^t$, A > 1, to reflect
replication, but neural realism forbids it. I believe that parameter C sufficiently accounts
for a quasi-replication because it increases the subsequent probability of selection. This
aspect of thinking needs a separate discussion. See CONCLUSION for some specifics.

Further in the process of selection, the calculated distribution is normalized, probability 
of previous acton and empty word ‘! ’ nullified, and a random number cast for selection. 
The nullification of ‘! ’ is also optional because the empty word may represent a sudden 
loss of the train of thought for whatever reason. 
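
The selection step can be put together as a short Python sketch. It is one possible reading of the update formula and the three ping-pong rules, not the author's PROTO script: the function name select_next, the uniform scalar G, the use of the Figure 9 parameter values as defaults, and the order in which normalization and nullification are applied are all assumptions.

```python
import random


def select_next(p, ww, prev, F=0.2, C=0.5, H=0.8, G=0.1, rng=random):
    """One THINKING step: return (index of the next acton, new distribution).

    p    -- current probabilities, one per word; index 0 is the empty word '!'
    ww   -- 0/1 connectivity matrix (list of lists) built during LISTENING
    prev -- index of the acton selected at the previous step
    """
    n = len(p)
    own = list(p)         # value each word uses for itself
    seen = list(p)        # value its neighbors perceive
    own[prev] = C         # rule 2 (as read here): the last acton remembers C
    seen[prev] = H * C    # rule 3: neighbors perceive the last acton as H*C

    raw = []
    for i in range(n):
        influence = sum(G * seen[j] for j in range(n) if j != i and ww[i][j])
        raw.append(own[i] + influence - F * own[i])   # p + sum(G p_j) - F p

    total = sum(raw)
    dist = [x / total for x in raw]                   # normalize

    draw = list(dist)
    draw[prev] = 0.0      # rule 1: no immediate repeat
    draw[0] = 0.0         # optional: the empty word is silenced
    if sum(draw) == 0:
        raise ValueError("no word is eligible for selection")
    chosen = rng.choices(range(n), weights=draw)[0]
    return chosen, dist


# Toy network: '!' plus three mutually bonded words.
ww = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]
p, acton = [0.25] * 4, 1
trajectory = [acton]
for _ in range(10):
    acton, p = select_next(p, ww, acton)
    trajectory.append(acton)
print(trajectory)
```

Leaving draw[0] untouched would model the option mentioned in the text, in which the empty word may win and the train of thought is lost.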

The remarkably simple premises of SCALE and PROTO open a passage to an 
extraordinary complexity of choices and possibilities to make the model realistic. 

A digression on neural realism follows. 


NEURAL REALISM 


The term itself has been used, obviously, in the area of simulated neural activity
and cognitive models. Regarding PROTO, there are at least two considerations in
the gray area of neural realism.

1. Neurons do not grow in numbers the same way organisms, molecules, and 
money do. Brain is a conservative system regarding matter. However intensely 
somebody is thinking, we cannot see any brain coming from his ears and we see, 
actually, nothing growing or even changing in any sense. The activity, of course, 
is detectable with proper equipment, but are we sure that we see thinking? We 
actually see metabolism, which is not growth, but dissipation. 

The brain uses energy to maintain the neurophysiological activity, which is
neither a two-way exchange, as in economy, nor a physical collision as in
molecular chemistry, nor a physical multiplication, as in replication of organisms
and nucleic acids. Neural events are impulses sent to neighbors in a network.
Therefore, Eigen’s equations could not be applied either in their initial form, or in
their deep replicative meaning. What remains from the idea is competition for a
limited resource, common for molecules, money, and organisms, as well as for
all brain cells (I mean energy) and even their brain owners (I mean “fortune”).

The event in PROTO is the one-time “contest” which results in a winner. As with 
humans, the participants can try their luck again and again. I do not mean that the 
competition in the mind is global and it elects a single winner. It happens, 
probably, all over the mind in a fractal manner. 

2. Selected actons cannot retain their status at the next selection. The presence of
a refractory phase (inactivity after firing) in neuronal activity is the main source of
realism in this regard. I can only guess, speculatively, that this cardinal,
biologically unavoidable fact of natural neuronal behavior was a condition for
the development of the mind, and not in any sense a limitation. This condition
makes neural dialogue possible. Marvin Minsky’s “Society of Mind” is more than
an apt metaphor.


AA = [46 52 10 39 41 56 39 37 41 45 41 48 38 19 37 52 41 56 32 6 38 52
38 41 38 42 62 36 52 6 32 52 37 62 32 62 37 18 37 56 49 62 49 10 19 62
19 62 41 46 41 46 16 6 57 45 62 6 41 62 42 49 32 49 41 33 41 52 6 52 61
30 6 33 32 33 58 62 49 58 37 52 41 46 62 46 32 47 62 33 49 34 58 19 58
31 46 62 50 56];

Figure 9. Trajectory AA and its graphic outputs; n=100, F=0.2, C=0.5,
H=0.8, G=0.1, initial acton a=42, ‘meet man’.
Panels: A. Trajectory as a sequence of actons. B. Distribution of actons over NAMES.
C. 3D trajectory, Z=1:100. D. Connector of thought, Z=1:62.
































The output of “thinking” is the trajectory: a sequence of the actons which are elevated to 
star status. To say “consciousness” would be inappropriate here; let us say instead, 
“borderline area”. 

Figure 9 shows the graphic outputs of 100 cycles of “thinking” (i.e., selection of 100 
actons) after the initial acton is enforced as doublet 42, ‘meet man’. In Figure 9B the 
position of the asterisk which represents an acton is slightly randomized. 

Script nums (from “numbers”) lists the activated thoughts along with their
occurrences (column 2) and, at the end, the Gini coefficient (JINI) of the distribution of
occurrences:


» nums

16 1 bundle
18 1 say
30 1 sow pig
31 1 pig three
34 1 third pig
36 1 sow sent
47 1 man give
48 1 give straw
50 1 straw build
57 1 ask enter
61 1 wolf eat
10 2 sent
39 2 sent seek
42 2 meet man
45 2 first say
19 4 give
33 4 second pig
38 4 sent away
56 4 ask first
58 4 say no
6 6 first
32 6 first pig
37 6 sent first
46 6 say man
49 6 give first
52 8 first build
41 11 first meet
62 12 eat first

total 28

JINI = 0.596
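
The tally can be recomputed from trajectory AA with a short Python sketch. It is not the nums script: the helper gini is hypothetical and uses the textbook definition over the 28 observed counts, which gives about 0.44 here rather than the reported 0.596, so nums presumably uses a different formula or population, and the sketch makes no attempt to match it.

```python
from collections import Counter

# Trajectory AA of Figure 9 (100 actons, numbered as in NAMES).
AA = [46, 52, 10, 39, 41, 56, 39, 37, 41, 45, 41, 48, 38, 19, 37, 52, 41, 56,
      32, 6, 38, 52, 38, 41, 38, 42, 62, 36, 52, 6, 32, 52, 37, 62, 32, 62,
      37, 18, 37, 56, 49, 62, 49, 10, 19, 62, 19, 62, 41, 46, 41, 46, 16, 6,
      57, 45, 62, 6, 41, 62, 42, 49, 32, 49, 41, 33, 41, 52, 6, 52, 61, 30,
      6, 33, 32, 33, 58, 62, 49, 58, 37, 52, 41, 46, 62, 46, 32, 47, 62, 33,
      49, 34, 58, 19, 58, 31, 46, 62, 50, 56]


def gini(values):
    """Textbook Gini coefficient of a list of non-negative numbers."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n


counts = Counter(AA)                       # word number -> occurrences
for word, k in sorted(counts.items(), key=lambda item: (item[1], item[0])):
    print(word, k)
print("total", len(counts))
print("gini over observed counts:", round(gini(list(counts.values())), 3))
```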



Thus, say (18) and sow pig (30) activate once each, while first meet (41) and eat first 
(62) appear 11 and 12 times within the 100 acton sequence. I am tempted to say that 
“consciousness” has depth and first meet and eat first are close to the surface, while 
bundle to first say sit at the bottom. Actually, I am quite serious. Earlier I proposed to 
consider consciousness as a cognitive mechanism necessary for linearization: one link of 
the chain of events at a time. To put it differently, consciousness is a window into the 
world, narrow enough to see the forest as a set of trees. Consciousness is a condition of 
cognitive analysis of the world and the synthesis in the form of thinking. 

The output of script nums is similar to content in Ulf Grenander’s GOLEM, while 
matrix WW corresponds to its connector. It is a small bowl of thought soup, in which 
we see pasta pieces of different length and quantity. 





6. SPEAKING 


At the stage of SPEAKING we already have the right to judge whether the output makes 
any human sense. We cannot hear the shrieks of thoughts being stretched on the 
Procrustean bed, but speech escapes the dark dungeons of the mind. 

Script say_con (“say content”; later changed for saycon2) generates all possible 
triplets from doublets along the Velcro rule of haplology. The program compiles a 
content condensate: list of doublets and triplets of Late Nean, which is a transitional 
state to full language. The condensate is rather focused, which is expected, but still 
surprising. It has a potential to be even more condensed, as I will further illustrate, 
thereby producing longer thoughts and opening a way to full human language, breaking 
through a crucial transition state. 
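
The Velcro rule for doublets can be sketched in Python as follows. This is an illustration only, not say_con: the function name condensate is hypothetical, and a plain “join whenever the last word of one doublet equals the first word of another” rule may produce combinations that the original script filters out.

```python
def condensate(doublets):
    """Join doublets 'a b' and 'b c' into the triplet 'a_b_c' (haplology)."""
    pairs = [tuple(d.split()) for d in doublets]
    triplets = set()
    for a, b in pairs:
        for c, d in pairs:
            if b == c:                       # shared word acts as the Velcro
                triplets.add("_".join((a, b, d)))
    return sorted(triplets)


print(condensate(["first say", "say man", "say no", "man give"]))
# ['first_say_man', 'first_say_no', 'say_man_give']
```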

Here is an example of a trajectory, its content (third column shows number of 
occurrences), and its condensate: 

Trajectory: 

AA = [46 52 10 39 41 56 39 37 41 45 41 48 38 19 37 52 41 56 32 6 38 52 

38 41 38 42 62 36 52 6 32 52 37 62 32 62 37 18 37 56 49 62 49 10 19 62 

19 62 41 46 41 46 16 6 57 45 62 6 41 62 42 49 32 49 41 33 41 52 6 52 61 

30 6 33 32 33 58 62 49 58 37 52 41 46 62 46 32 47 62 33 49 34 58 19 58 

31 46 62 50 56]; 


Content: 

16 1 bundle
18 1 say
30 1 sow pig
31 1 pig three
34 1 third pig
36 1 sow sent
47 1 man give
48 1 give straw
50 1 straw build
57 1 ask enter
61 1 wolf eat
10 2 sent
39 2 sent seek
42 2 meet man
45 2 first say
19 4 give
33 4 second pig
38 4 sent away
56 4 ask first
58 4 say no
6 6 first
32 6 first pig
37 6 sent first
46 6 say man
49 6 give first
52 8 first build
41 11 first meet
62 12 eat first





Content condensate: 


eat_first_build;
eat_first_meet;
eat_first_pig;
eat_first_say;
first_meet_man;
first_say_man;
first_say_no;
give_first_build;
give_first_meet;
give_first_pig;
give_first_say;
give_straw_build;
man_give_;
man_give_first;
man_give_straw;
meet_man_give;
say_man_give;
sent_first_build;
sent_first_meet;
sent_first_pig;
sent_first_say;
sow_pig_three;
sow_sent_away;
sow_sent_first;
sow_sent_seek;
third_pig_three;
wolf_eat_first;


Longer regular strings can be velcroed together from the content: 

first say man give first pig 

Isn’t it “First say[s] to man [:] give [something to] first [who is a] pig?” (Just a
question).

first meet man give first pig 

Isn’t it “First meet[s] man [who] give[s something to] first [who is a] pig?” (Just a 
question!) 

A longer and bolder derivation: 

first_say_man; say_man_give; man_give_straw; give_straw_build; ->

First say, “Man, give [me some] straw [to] build [a house].” 
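
The same Velcro rule extends to chains of triplets, each overlapping the previous one by two words. A minimal Python sketch (the function name velcro is hypothetical, and choosing which triplets to chain is left to the caller):

```python
def velcro(chain):
    """Fasten overlapping triplets such as 'a_b_c', 'b_c_d' into 'a b c d'."""
    words = chain[0].split("_")
    for triplet in chain[1:]:
        nxt = triplet.split("_")
        overlap = len(nxt) - 1
        if words[-overlap:] != nxt[:overlap]:
            raise ValueError("no overlap between " + " ".join(words)
                             + " and " + triplet)
        words.append(nxt[-1])               # only the new word is added
    return " ".join(words)


print(velcro(["first_say_man", "say_man_give",
              "man_give_straw", "give_straw_build"]))
# first say man give straw build
```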

And don’t you see that there were three pigs in a family: first [was one] pig [out of] 
three ; second [was one] pig [out of] three ; third [was one] pig [out of] three ? 

Still, the question needs an answer. In the context of the pig story told in Nean, the 
answer is: 

Our interpretation is correct if it does not lead to instability with fellow cavemen. 

This is crucial for understanding the emergence of language, but again, it is a separate
topic. I believe the book by Nikolaus Ritt, Selfish Sounds and Linguistic Evolution: A
Darwinian Approach to Language Change (Cambridge University Press, 2004), should
tackle this subject, but I have not yet read it.

Remember that the whole trajectory of THINKING in the last example was triggered by 
initial acton ‘meet man’. Our baby mind seems to stay focused on the topic. The tragic 
end of the hero is not forgotten, however: wolf_eat_first. 








But what is the meaning of the common element in the non-linear clusters with the same 
beginning or end? 


first_meet_man;
first_pig_three;
first_say_;


meet_man_give;
say_man_give;


eat_first_pig;
sent_first_pig;
give_first_pig;


For the answer I have to refer to my e-publications on extraction of grammar by 
simplistic rules: Salt and Salt2. 

The short answer is: common elements in the same position define a category of 
grammar: first for subject, give for verb, and first pig for object. 

As soon as the words are categorized, the emerging grammar attributes either order or 
inflections, or both to them. Thus, first is a generalized symbol for all things which can 
meet, be, and say; give is something that at least man (and others, as will be seen with 
more experience) does; first_pig symbolizes everything that can be eaten, sent, and 
given to. 
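
Grouping triplets by a common beginning or a common end can be sketched in Python. This is only an illustration of the clustering idea, not the grammar extraction of Salt and Salt2; the function name clusters and the choice of “first word” and “last two words” as keys are assumptions.

```python
from collections import defaultdict


def clusters(triplets):
    """Group triplets by shared first word and by shared last two words."""
    by_start, by_end = defaultdict(list), defaultdict(list)
    for t in triplets:
        a, b, c = t.split("_")
        by_start[a].append(t)
        by_end[b + " " + c].append(t)
    # A category needs at least two members sharing the element.
    starts = {k: v for k, v in by_start.items() if len(v) > 1}
    ends = {k: v for k, v in by_end.items() if len(v) > 1}
    return starts, ends


starts, ends = clusters(["first_meet_man", "first_pig_three", "meet_man_give",
                         "say_man_give", "eat_first_pig", "sent_first_pig",
                         "give_first_pig"])
print(starts)   # candidates for the subject slot
print(ends)     # candidates for the verb and object slots
```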

The second interpretation opens a way to vertical linearization by common beginning: 

man_give_first;
man_give_straw;
-> man give first straw, or man give straw [to] first

Indeed, haplology is just a particular case of contraction.

sow_sent_away;
sow_sent_first;
sow_sent_seek;
-> sow sent first [pig] away [to] seek [fortune]

There is also a third interpretation: the topic, as in Japanese and some atypical cases of 
English and other languages. 

The modern language begins.

Leaving the cozy simplicity of Nean, humans face a great uncertainty of the choice of 
grammar. Would that be too hubristic to explain the overwhelming diversity of human 
languages by their common origin from the extreme simplicity of Nean? By starting from 
a point on a plane, you can move in any direction, but you have a tree of more narrow 
choices after the first step. Biological diversity illustrates the principle of “more 
complexity—less choice” best of all. A leopard can change its spots, but cannot get rid of 
its vertebrae. In the spirit of pattern chemistry, this applies to any evolution, from life 
(single cell) to politics (ambition) to Internet (fun of function) to terrorism (fun of killing). 





Formally, human speech is just a particular kind of performance, whether improvised or
memorized, but usually both. Regarding memory (a problem no less mysterious than the
origin of the universe), here is a digression on the subject.

MEMORY AND LINEARIZATION 

Professionals in any area are capable of keeping in memory sequences of words, 
numbers, movements, and other apparently non-linear configurations as large as a 
symphony, part in a long play, big chemical formula, epic poem, long 
mathematical proof, history of a nation, assembly of a device, etc. I suspect that 
this is possible because of a fast assembly of linear fragmentary thoughts with a 
hierarchy of Velcro fasteners to put them out as speech or action. It is also
possible that long-term memory is characterized by an especially conservative set
of parameters in Eigen-type equations.

This is where Noam Chomsky’s universal grammar comes to mind. If there is 
anything that could be qualified as universal grammar, it is the inborn ability of 
fast assembly of a linear behavioral configuration from short linear fragments of 
thought. The language instinct is to linearize the content of the mind. 

There is no originality, especially, after the work of Eric Kandel, in saying that 
the elementary unit in pattern chemistry of memory is simply a bond between 
two physiological carriers, neurons or not. There is a subtler aspect of the 
problem, however. 

A movie production is a good, albeit too literal, metaphor of assembly of a long 
sequence from short fragments shot along the linear script in often broken and 
non-linear order, so that death could be filmed before birth and divorce before 
marriage. 

The performance of an actor on stage unrolls in natural time and in strictly
linear order; it cannot be cut into confetti, like a mortal combat in a violent movie.

It appears non-linear, however: the actor (1) speaks and accompanies words with 
(2) emphatic intonation, (3) gestures, (4) body part movements, (5) facial 
expressions, (6) silent action, (7) manipulations of objects and partners. At a 
closer look, however, this complex multidimensional sequence of configurations 
is a bundle of parallel linear actions, which I have just enumerated. It is kept in 
order by the script. 

A symphony is a similar, although more transparent, fiber bundle, visualized in 
the score. Each musician, in turn, performs a mini-symphony of smaller linear 
parts played by fingers, arms, and lungs. All individual strands of actions are 
organized by rhythm. 

I use this digression to emphasize the contrast between the current paradigm of
artificial Artificial Intelligence and its possible alternative, natural Artificial
Intelligence. Natural thinking and behavior emerge neither from a likeness of a
computer code, nor from the equations of chaos theory, but from random but
organized pattern-chemical interactions and transformations, each with its energy
cost. Yet a (non-parallel) computer, like the human mind, linearizes the non-linear ifs
and loops of the code. In this sense, the computer speaks without thinking. So, don’t
be fooled.





7. FROM SPEECH TO UNDERSTANDING 


SPEAKING is not supposed to fall on deaf ears. Although the SPEAKING of EVE sounds
like crude pidgin English, it is addressed to us, listeners who know the context of the story
and probably even have seen a live pig and a live wolf. Therefore, we are free to evaluate
SPEAKING without the guilt of excessive anthropomorphism. This time we start a new
turn of the spiral started by the cave stories as listeners.

We UNDERSTAND speech because it is ordered by the 
way it is generated by the speaker and is passing through 
the relatively wide and fading toward the periphery 
spotlight of consciousness in the same order. 

The old paper tape telegraph is an appropriate ideogram of verbal communication. What
it lacks is the “burn after reading” procedure performed by the ephemeral sound of
speech on itself. This cardinal difference between reading and hearing was the point of
divergence between spoken and written languages.

In this section I will present some observations of a listener on the behavior of EVE. I 
am trying to understand what EVE is saying and how it can help understand the origin of 
thought and language. 

By no means do I consider EVE a linguistic computer model, whatever I say (for lack of
better terms or for excess of enthusiasm) about it here. EVE is just a means to generate
illustrations of the principles of evolutionary simplicity, which might well apply to any 
other evolution toward a high complexity, for example, DNA, organism, society, or 
science. Therefore, the words “behavior of EVE,” which I will use, mean literally the 
output of some software designed to produce illustrations. 

The reason why I want some hard-to-achieve clarity in this regard is my skepticism
concerning computer models of exystems. I will return to this point in the CONCLUSION.

By protolanguage we may mean at least two hypothetical things. One is the very first 
spoken language, die Ursprache, with its lexicon, and the other is the structure of the first 
language, regardless of the lexicon and phonology, which is what I mean. It is difficult to 
achieve consensus on something as hypothetical. Whether Nean is a protolanguage or not 
a language at all is a matter of definition, which I will put aside. 

Thought is not as hypothetical as protolanguage, but it is even more evasive. There is 
something going on in the mind, which we call “thought” (inner thought, process) but 
never see anything but some brain imaging shadows. There is also thought as the 
meaning of speech or written sentence (outer thought, content, idea), which can be etched 
in stone, but from Shakespeare to Sartre to Sarah Palin, to achieve consensus on what
they were thinking before uttering a phrase is not always an easy task. Besides, as
Chisholm’s Second Corollary [to Murphy’s law] says, “If you explain so clearly that
nobody can misunderstand, somebody will.”

As a pattern chemist, I see the inner thought as a “soup” of fragments and the outer 
thought as their spoken linearization as it is understood and can be translated and 
explained. Instead of “soup” a chemist would say: liquid mixture of short elementary 
thoughts and longer clusters in motion, i.e., physical suspension or solution (strangely 
cerebral opposites). “Soup,” however, is a standard term in any discussion of chemical 
origin of life. It is a typical ideogram. 

The linear topology, however, is not enough for understanding: thought has to carry a 
message (thought, message, idea, content—so confusing!). Moreover, thought can carry a 
message relevant only in a context of a particular situation (a saber-tooth behind partner’s 
back or a philosophy class) or a timeless message retold through generations, like a myth, 
ancestral lineage, or a real story about the hunter and his partner still remembered by 
older members of the tribe only because it was extraordinary. Message requires tags, like 
time and place, which asks for a larger and consistent thought, or, rather, a story. 
Language develops from signal to story. 

The behavior of EVE shows some distinct patterns, depending on the parameters C, G,
H, and F. Regarding selection, the calculation unit (script select3.m) takes into account
the arity Ar of the NAME by dividing the sum of influences from the neighbors by
$\sqrt{Ar}$. Probably, $\ln Ar$ instead of $\sqrt{Ar}$ should also be tested. This kind of choice is
intuitive and I cannot yet find any rationale for it. Random mutations of the algorithm
would be the best.
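The arity normalization described above can be illustrated with a minimal sketch. It is written in Python rather than in the MATLAB of select3.m, which is not reproduced in this text, so every name in it (`neighbor_influence`, `EXAMPLE_WEIGHTS`, `EXAMPLE_ACTIVITY`) and the reading of "influence" as the summed activity of connected NAMES are assumptions made for illustration only.

```python
import math

# Hypothetical toy connectivity: NAME 0 is bonded to NAMES 1..4 (arity 4),
# each of the others is bonded only to NAME 0 (arity 1).
EXAMPLE_WEIGHTS = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
# Hypothetical activity of each NAME (all equal here).
EXAMPLE_ACTIVITY = [1.0, 1.0, 1.0, 1.0, 1.0]


def neighbor_influence(weights, activity, name, use_log=False):
    """Sum the activity of the neighbors of `name`, scaled by its arity.

    The sum is divided by sqrt(Ar) by default, or by ln(Ar) when
    `use_log` is True (the alternative mentioned in the text).
    """
    neighbors = [j for j, bonded in enumerate(weights[name]) if bonded]
    arity = len(neighbors)
    if arity == 0:
        return 0.0
    total = sum(activity[j] for j in neighbors)
    if use_log:
        # ln(1) = 0, so a NAME of arity 1 is left unscaled in this sketch.
        scale = math.log(arity) if arity > 1 else 1.0
    else:
        scale = math.sqrt(arity)
    return total / scale


print(neighbor_influence(EXAMPLE_WEIGHTS, EXAMPLE_ACTIVITY, 0))
print(neighbor_influence(EXAMPLE_WEIGHTS, EXAMPLE_ACTIVITY, 0, use_log=True))
```

The square root damps highly connected NAMES less strongly than division by Ar itself would, while the logarithm damps them even less; the sketch only makes that difference easy to try out.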


Several types of graphic output are shown next. The runs begin with initiation
(equalization of probability distribution) and the initial acton NAME 53 (wolf come), followed by 150
subsequent cycles in batches of 25, of which only the last 25 are shown here as
trajectories in WORLD space, ordered along axis Z from 1 to 62. The actons can flock
together in the last section of the WORLD (upper part of the cylinder), where wolf
appears (Figures 10-1 and 10-9); or, in addition, heavily refer to the 1:28 section of single
NAMES (Figures 10-8 and 10-10); or be highly condensed, as in Figures 10-7 and 10-8.





Panel parameters:

1: C = 0.5; G = 1; H = 0.8; F = 0.8
2: C = 0.5; G = 1; H = 0.8; F = 0.2
3: C = 0.5; G = 1; H = 0.2; F = 0.2
4: C = 0.2; G = 1; H = 0.2; F = 0.2
5: C = 1; G = 1; H = 1; F = 1
6: C = 0.1; G = 0.1; H = 0.1; F = 0.1
7: C = 1; G = 0.1; H = 1; F = 0.1
8: C = 1; G = 0.1; H = 1; F = 1
9: C = 0.2; G = 1; H = 1; F = 0.5
10: C = 0.2; G = 1; H = 1; F = 1

11. Part of the batch command:

disp(' 1 : C = 0.5; G = 1; H = 0.8; F = 0.8;')
C = 0.5; G = 1; H = 0.8; F = 0.8; %1
init, n=25, disp('init'), a=53;
p3d3s, disp(' '); disp(['AA=[', num2str(AA'), '];']), nums, saycon2,
disp(['ain ' num2str(ain) ', a fin ' num2str(a)]),

n=25; p3d3s, disp(' '); disp(['AA=[', num2str(AA'), '];']), nums, saycon2,
disp(['ain ' num2str(ain) ', a fin ' num2str(a)])

(etc... 4 more times)


Figure 10. Graphic output of EVE, initial acton “wolf come,” No. 53.

The following is part of the text output, C = 0.5; G = 1; H = 0.8; F = 0.2; cycles 75:100.


AA=[54 50 49 48 37 50 21 32 56 49 56 45 62 49 62 56 62 61 57 28 37 55 46 54 62];

content:

1 21 1 build
2 28 1 eat
3 32 1 first_pig
4 45 1 first_say
5 46 1 say_man
6 48 1 give_straw
7 55 1 wolf_ask
8 57 1 ask_enter
9 61 1 wolf_eat
10 37 2 sent_first
11 50 2 straw_build
12 54 2 come_house
13 49 3 give_first
14 56 3 ask_first
15 62 4 eat_first

condensate:

sent_first_pig ; *
give_first_pig ;
ask_first_pig ;
eat_first_pig ;
sent_first_say ;
give_first_say ; *
ask_first_say ;
eat_first_say ;
first_say_man ; *
give_straw_build ; *
wolf_ask_first ; *
wolf_ask_enter ; *
wolf_eat_first ; *

total content 15
condensate 13
narr. cond. 7
JINI = 0.79
cond/cont = 0.87
nr.cnd/cnt = 0.47
ain 48, a fin 62


The end comments in the output show the sizes of content, condensate, and narrow
condensate, the Gini inequality coefficient (JINI), and the size ratios of condensate and narrow
condensate to content.
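The content listing and the two size ratios in the output above can be recomputed from the trajectory AA alone. The following is a minimal sketch in Python, not the author's nums or saycon2 scripts; the function name `content_table` is hypothetical, and the sizes of the condensate (13) and the narrow condensate (7) are simply copied from the printed output. The JINI value is not recomputed, because the text does not say over which distribution it is taken.

```python
from collections import Counter

# Trajectory of 25 actons (NAME numbers), copied from the text output above.
AA = [54, 50, 49, 48, 37, 50, 21, 32, 56, 49, 56, 45, 62, 49,
      62, 56, 62, 61, 57, 28, 37, 55, 46, 54, 62]


def content_table(trajectory):
    """Return (NAME number, occurrences) pairs, rarest first, then by number."""
    counts = Counter(trajectory)
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


table = content_table(AA)
content_size = len(table)   # number of distinct NAMES visited in the batch
condensate_size = 13        # copied from the output, not recomputed here
narrow_size = 7             # copied from the output, not recomputed here

for index, (name_number, occurrences) in enumerate(table, start=1):
    print(index, name_number, occurrences)
print("cond/cont =", round(condensate_size / content_size, 2))
print("nr.cnd/cnt =", round(narrow_size / content_size, 2))
```

Sorting by count and then by NAME number reproduces the order of the printed content list, which suggests, without proving, that the original script sorts the same way.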


















Asterisks mark triplets, the starting doublets of which, for example, sent_first +
first_pig -> sent_first_pig, have a difference of coordinates in WORLD of not
more than 5. This difference (condensate range) can be varied. I remind the reader that WORLD
approximately preserves the order of LISTENING, i.e., the order of events in the story
(Leibniz: time is order of a sequence of events).
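The stitching of doublets into triplets and the asterisk rule can be sketched as follows. This is a Python illustration, not the author's saycon2 script: the function name `condensate` is hypothetical, the NAME number of a doublet is assumed to be its coordinate in WORLD, and the doublets are the thirteen two-word NAMES of the content list for cycles 75:100.

```python
# Doublets from the content list (cycles 75:100) with their NAME numbers,
# which are assumed here to double as coordinates in WORLD.
DOUBLETS = {
    "first_pig": 32, "sent_first": 37, "first_say": 45, "say_man": 46,
    "give_straw": 48, "give_first": 49, "straw_build": 50, "come_house": 54,
    "wolf_ask": 55, "ask_first": 56, "ask_enter": 57, "wolf_eat": 61,
    "eat_first": 62,
}


def condensate(doublets, max_range=5):
    """Stitch doublets a_b and b_c into triplets a_b_c.

    Returns (triplet, is_narrow) pairs; is_narrow is True when the two
    doublets lie within `max_range` of each other in WORLD.
    """
    triplets = []
    for left, left_pos in doublets.items():
        first, middle = left.split("_")
        for right, right_pos in doublets.items():
            middle_again, last = right.split("_")
            if middle == middle_again:
                is_narrow = abs(left_pos - right_pos) <= max_range
                triplets.append((f"{first}_{middle}_{last}", is_narrow))
    return triplets


all_triplets = condensate(DOUBLETS)
narrow = sorted(name for name, is_narrow in all_triplets if is_narrow)
print(len(all_triplets), "triplets,", len(narrow), "in the narrow condensate")
for name in narrow:
    print(name, "*")
```

Under these assumptions the sketch yields 13 triplets, 7 of them narrow, which agrees with the sizes "condensate 13" and "narr. cond. 7" reported in the output.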

The consistency of the narrow condensate, as I call it, improves, as expected, although
without much room for further stitching. Thus, the confusing “eat_first_say” and
“ask_first_say” do not get into the “thought soup”:





first_say_man *
give_straw_build *
wolf_ask_first *
wolf_ask_enter *
wolf_eat_first *


[Sow] sent first pig. “Give,” First say, “man, give straw
build.” Wolf ask First, wolf ask enter, Wolf eat First.


Figure 11. Transmission of thought from condensed thought soup to
SPEAKING to LISTENING and toward UNDERSTANDING


A fragment of the story is approximately retold, with gaps. 


Let us take the case presented in Figure 10-7, C = 1, G = 0.1, H = 1, F = 0.1, cycles
126:150. It is marked by a border. With the influence of neighbors as low as G = 0.1 we
would expect a very rambling THINKING. Yet with low forgetting F and high C it looks
well focused:


content:

1 18 1 say
2 24 1 ask
3 45 1 first_say
4 55 1 wolf_ask
5 60 1 blow_house
6 57 2 ask_enter
7 61 3 wolf_eat
8 62 3 eat_first
9 22 5 wolf
10 59 7 wolf_blow

condensate:

eat_first_say ;
wolf_ask_enter ; *
wolf_blow_house ; *
wolf_eat_first ; *


Wolf ask enter, wolf blow house, wolf eat first. This is the essence of what happens 
after wolf come. 


It might be of interest how content in this particular and strange (Figure 10-7) case
progresses with time (1:150) from an empty condensate. It tells something about the
kinetics, as the chemists say, of the process: the relatively slow speed which is typical of
most transformations in organic chemistry. Slow chemical kinetics was considered
evidence of a bottleneck and a major stimulus for the theory of transition state.


1:25 content:

1 21 1 build
2 31 4 pig_three
3 59 4 wolf_blow
4 61 7 wolf_eat
5 22 9 wolf

condensate:
[empty]

26:50 content:

1 27 1 blow
2 23 2 come
3 55 2 wolf_ask
4 57 3 ask_enter
5 59 3 wolf_blow
6 53 5 wolf_come
7 22 9 wolf

condensate:
wolf_ask_enter ; *

51:75 content:

1 20 2 house
2 53 2 wolf_come
3 55 2 wolf_ask
4 60 2 blow_house
5 22 3 wolf
6 27 4 blow
7 61 4 wolf_eat
8 59 6 wolf_blow

condensate:
wolf_blow_house ; *

76:100 content:

1 22 1 wolf
2 51 1 build_house
3 53 1 wolf_come
4 60 2 blow_house
5 27 8 blow
6 59 12 wolf_blow

condensate:
wolf_blow_house ; *

101:125 content:

1 27 1 blow
2 55 1 wolf_ask
3 60 1 blow_house
4 20 2 house
5 23 2 come
6 53 2 wolf_come
7 61 2 wolf_eat
8 22 3 wolf
9 54 4 come_house
10 59 7 wolf_blow

condensate:
wolf_come_house ; *
wolf_blow_house ; *

126:150 content:

1 18 1 say
2 24 1 ask
3 45 1 first_say
4 55 1 wolf_ask
5 60 1 blow_house
6 57 2 ask_enter
7 61 3 wolf_eat
8 62 3 eat_first
9 22 5 wolf
10 59 7 wolf_blow

condensate:
eat_first_say ;
wolf_ask_enter ; *
wolf_blow_house ; *
wolf_eat_first ; *


Thinking could be a slow, truly chemical process, but the result can come at any time,
which is evidence that human thinking is a random process in a system of moderate
size. The smaller the system, the sooner the result (i.e., arrival at stability). This is why
the standard way of solving a difficult problem is to simplify it and to think for a long
time, I am not sure in which order. Henri Poincaré and Jacques Hadamard left classical
descriptions of the process of mathematical creation, which illustrate the major kinetic
properties of thinking.


MEYER'S LAW: “It is a simple task to make things complex, but a complex task to
make them simple.” One of the reasons why democracy can be as dysfunctional as it
currently is in the USA is either the impossibility of simplification or the fast pace of
uncontrollable events, or both (see APPENDIX 2). Problem solving could be one of the
future tasks for EVE, and it may turn out to be unsolvable.


The right choice of parameters C, G, H, and F or just a longer thinking can probably 
improve SPEAKING. 


The left column in Table 2 shows the combined narrow condensate from 150 cycles at
C = 1, G = 1, H = 1, F = 1; the last 25 cycles are shown in Figure 10-5. Other columns
are some of the possible longer derivations. I assume that evolutionary contraction comprises
elimination of repeating doublets, as well as singlets, and works vertically, as well as
horizontally, albeit with the inevitable uncertainty of linearization. Note that wrong
thoughts are easily generated, too. Animals and humans make mistakes and what can be
more natural than an error or an excessive enthusiasm which is often its precursor?

Wolf is not an object of any action and this is why there is no bond to it until the end of 
the full tripartite story, not yet told. Meet_man_give seems clever, but less 
straightforward: indeed, the distance between meet_man and man_give in WORLD is 
exactly 5, the maximum. 


Table 2. Derivation of longer thoughts from narrow condensate


Narrow condensate:

ask_first_build;
first_build_house;
first_meet_man;
first_say_man;
give_first_build;
give_first_say;
man_give_first;
man_give_straw;
meet_man_give;
say_man_bundle;
say_man_give;
sent_first_meet;
sent_first_pig;
sow_sent_first;
sow_sent_seek;
wolf_ask_enter;
wolf_ask_first;
wolf_blow_house;
wolf_eat_first;

Horizontal derivations:

ask_first_build_house
first_meet_man_give_first
first_say_man_give_straw
first_meet_man_give_straw
give_first_say_man_bundle
first_meet_man_give_first_build_house
sow_sent_first_meet_man (dubious)
wolf_ask_first_build_house (wrong!)

Vertical derivations:

first: meet_man, say_man, build_house
man_give: first, straw
say_man: give, bundle
sow_sent: first, seek
wolf_ask: first, enter
wolf: blow_house, eat_first
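One way to read the vertical derivations in Table 2 is as a factoring of triplets that share their leading doublet. The sketch below is a Python illustration under that assumption only; the function name `vertical_groups` is hypothetical, and the text does not state that this is how the table was actually produced. Entries that share only one leading word, such as "wolf: blow_house, eat_first", would need a second grouping on the first word and are not covered here.

```python
from collections import defaultdict

# Narrow condensate of Table 2, written with underscores.
NARROW_CONDENSATE = [
    "ask_first_build", "first_build_house", "first_meet_man", "first_say_man",
    "give_first_build", "give_first_say", "man_give_first", "man_give_straw",
    "meet_man_give", "say_man_bundle", "say_man_give", "sent_first_meet",
    "sent_first_pig", "sow_sent_first", "sow_sent_seek", "wolf_ask_enter",
    "wolf_ask_first", "wolf_blow_house", "wolf_eat_first",
]


def vertical_groups(triplets):
    """Group triplets a_b_c by their leading doublet a_b.

    Only doublets followed by at least two different endings are kept,
    since a single ending leaves nothing to contract.
    """
    groups = defaultdict(list)
    for triplet in triplets:
        first, second, third = triplet.split("_")
        groups[f"{first}_{second}"].append(third)
    return {head: sorted(tails) for head, tails in groups.items() if len(tails) > 1}


for head, tails in vertical_groups(NARROW_CONDENSATE).items():
    print(f"{head}: {', '.join(tails)}")
```

The grouping also produces pairs such as "sent_first: meet, pig" that are not listed in the table, which is consistent with the remark that the table shows only some of the possible derivations.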


The vertical contraction of 

first_meet_man_give_first_build_house and 
first_meet_man_give_straw 

gives: 

first_meet_man_give_straw_first_build_house . 


At this point, the most significant and remarkable result for me is 
not the possibility of generating long speech-ready thoughts from 
doublets, but the fact that doublets alone are sufficient for verbal 
communication. Nean seems like a real language of thought. 
Nevertheless, looking at Table 2 I have a feeling of a glass wall 
separating me from the full-blown language. 










EVE is just dying for a grammar which would smoothly, without a bottleneck, handle:

man_give_first + man_give_straw → man_give_first_straw or man_give_straw_first


EVE’s behavior definitely gives (at least to myself) some food for thought and generates 
questions which are halfway to understanding. 

Doublets first_say (45) and say_no (58) are adjacent in the last part of the story, but
first_say_no is not in the narrow condensate because first_say occurs twice, but has only
one, much earlier coordinate (45), when first talks to man. I let it be so out of curiosity,
not knowing what to expect, and here is the answer.

The impossibility to have first_say_no* in the narrow condensate poses a paradoxical 
problem: how to resolve this shortcoming without using my own creativity. 

A possible way is to somehow use the natural signals of the end 
of phrase, like pause, intonation, or a word in a fixed syntactic 
position at the end of phrase, as, for example, the Japanese verb. 

Russian colloquial “nu” and its English counterpart “well” in the
beginning of a spoken phrase convey the start of a thought
expression. I believe the sterility will be preserved if we simply
consider them (and punctuation marks) NAMES, but this seems
already far beyond Nean. An intriguing question: when I think
about fixing a problem in EVE, do I follow the same pattern as the
evolving natural language did? Yes, but not as a blind watchmaker.

Nu, well, how can I be sure?



It has taken me some time to notice that first_say is the only doublet that has been
classified by SCALE as old during LISTENING. Therefore, something must happen with
it on the second occurrence which does not happen with the rest of NAMES. What
“second occurrence” implies is the single continuous timeline (history) of the system,
which is the intrinsic property of an exystem and part of its definition. Under normal
conditions no memory of an exystem can be externally manipulated, even if it is an
Orwellian exystem.

While I am developing some ideas about improving EVE, I realize that they are not
sterile. I do not have a clear vision of how the natural EVE could arrive at any such step on
its own. But am I really outside EVE if it is a creation of my mind? I know
introspectively that my thinking is not algorithmic and, therefore (can we even say 
“therefore?”), as chaotic as the movements of a trapped animal. I assume that in a way it 
is a recapitulation of some stages of phylogenesis of the mind. EVE stimulates asking 
difficult and troubled questions. The deepest among them is how the phylogenesis of 
mind can be spontaneously recorded, i.e., what is the genome of the mind? In molecular 
chemistry and AI this question is easy to answer. 




There is another, smaller, problem: what to do about erroneous outer thoughts like
wolf_ask_first + ask_first_build + first_build_house → wolf_ask_first_build_house?
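The chain above is a purely formal operation: each triplet overlaps the next one by two words. A minimal Python sketch of such horizontal stitching follows; the function name `chain_triplets` is hypothetical and the two-word overlap rule is an assumption read off the example, not taken from the MATLAB scripts.

```python
def chain_triplets(triplets):
    """Stitch triplets that overlap by two words into one longer thought.

    a_b_c followed by b_c_d gives a_b_c_d; a ValueError is raised when
    two consecutive triplets do not overlap in this way.
    """
    words = triplets[0].split("_")
    for triplet in triplets[1:]:
        next_words = triplet.split("_")
        if words[-2:] != next_words[:2]:
            raise ValueError(f"{triplet} does not continue {'_'.join(words)}")
        words.append(next_words[2])
    return "_".join(words)


wrong_thought = chain_triplets(
    ["wolf_ask_first", "ask_first_build", "first_build_house"]
)
print(wrong_thought)
```

Nothing in the rule checks the meaning of the result, which is why it produces the erroneous thought as readily as a correct one.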

Is it enough to say that to err is human? It could be, of course, an honest mistake of a 
child retelling the story or answering a question. Now we are talking about possible 
psychological, not digital, experiments, but I intend to stay with pattern chemistry which 
is trans-disciplinary. 

Evolution is a sequence of solved problems: a steeplechase of successfully passed 
transition barriers , speaking chemically, known as punctuated equilibrium in biology. 

I suggest the sterility condition for evolutionary games as: 

The solution must be so simple that it could occur at random, 

like the variation of the parameters and the size of the spotlight at SPEAKING. I must 
confess, however, that I had invented the variable spotlight myself (it suddenly occurred 
to me after intense chaotic, soup-like thinking) and found the excuse later. I hope this 
illustrates what I mean by intellectual infection and how difficult it is to avoid it in 
Natural Artificial Intelligence. 

The above results illustrate, in my opinion, the importance of knowledge acquisition
through stories, procedures, and, generally, sequences, which opens, through
evolutionary feedback, a way to complex knowledge. This is why children should be told
consistent stories. Origin of language is inseparable from the actual living conditions,
environment, and material culture. Its closest pattern-chemical parallel is the origin of the
cell genome. To fantasize further, the consciousness, which is the cauldron for the
momentary batch of the thought soup, should be neither too big nor too small.

Moreover, there could be various levels of wider subconsciousness below the narrow 
working consciousness. 

In big systems some combinatorial situations can never realize, while in small systems 
they constantly repeat but do not have much choice. 

A speculative discussion of all these problems could be endless. This is a good time to 
stop SPEAKING and do more THINKING. 





8. CONCLUSION 


No firm conclusion can be made from the first preliminary computer games. 

But what exactly does “firm conclusion” mean? 

The value of computer simulations is a vast and controversial topic. Manfred Eigen’s
entire work in chemical evolution could be questioned in this regard, not to mention the
whole boundless area of the hectic activity of modern descendants of mythical centaurs:
humans fused with controlled computers and computers fused with controlled humans.

I tend to draw a sharp distinction between realistic models, like weather, bridge, and 
airplane, and speculative modeling, like ALIFE (Artificial Life) and econophysics. Ulf 
Grenander’s GOLEM belongs to the second category. On the contrary, Pattern Theory of 
medical image processing belongs to the first type. 

The significance of the first kind is that it generates knowledge, while the second kind 
generates understanding. The whole Pattern Theory is, first of all, a way to understand 
the complex world as a whole, regardless of what money we can make on that. 
Knowledge today is for sale, but understanding is still almost free. Have in mind, 
however, that understanding can be wrong, while knowledge is testable and certifiable. 

I start Nean with extreme simplicity, but very soon the horizons of extreme complexity 
come near, within an arm distance. In order to draw even speculative, but somewhat 
firmer, conclusions, I have to explore so many combinations of so many factors, that with 
the means at my disposal it would take a second life. Besides, I believe that for the sake 
of knowledge the mind should be built, not just simulated. Chemists just go to the bench 
and do what has to be done, quite like the heroes of John Wayne. 

Earlier in his life, Manfred Eigen was awarded the Nobel Prize for fine experimental work on
“immeasurably fast” chemical reactions, based on an ingenious theoretical idea. Much
later he expressed (see long interview) the relation between theory and experiment in this
way: “No theory without experiment and no experiment without theory.” Obviously, he
could not be satisfied with speculation alone and his work after 1980 became
experimental. Note, however, that it is not yet about origin of life and beginning of
evolution, but about replication of biopolymers.

I stop here with a hope to be able to continue. I see the significance of the first
speculative “experiments” in making the first simple step from simplicity to complexity
and discovering that complexity in fact emerges if we repeat simple steps. Whether it
all is true, false, or irrelevant (Manfred Eigen’s three-fold classification), I cannot say. The
experiment, ultimately, would consist in educating a young EVE until it satisfies the
Turing test even at a level much below Commander Data from Star Trek. It should be a
machine with personality, capable of honest errors and progress.

Why is it important to be natural? Here is an example. The choice of G is especially
difficult because words differ in the number of connections (arity, Ar). I have not yet
decided on the choice of G and use in saycon2 a coefficient B in the product BG: $B = 1/\sqrt{Ar}$.

To tinker with parameters, however, is a cumbersome and very AI-ish work. A much
more daring step toward naturalization would be to let the model search for maximum
stability in the interaction with the human environment, registering the frustration or
satisfaction of the humans around the model, as well as the “mood” of EVE. The idea, on
which I have a fixation, that homeostasis is at the core of intelligence comes from Ross
Ashby.

I must acknowledge that I have been ignoring the growing and exciting area of research 
on embodied agents and embodied evolution, including evolution of language, to which 
Stefano Nolfi has kindly drawn my attention. The word embodied here has no esoteric 
connotation and means simply that the objects of study are material objects and not just 
manipulation of digits in computers. In my opinion it is the direction which both John 
Wayne (whose sharp shots were fake) and Ross Ashby (whose rambling homeostat was 
100% real) would welcome. Robots consume and dissipate energy, without which no 
exystem is possible. They can compete for a limited resource of energy, matter, and space, 
which is all we want from an avatar (move aside, John Wayne!) of the real life. I am 
especially interested in the stability dynamics of robots and their societies and whether 
their behavior is of the Lévy walk type. How would they interact with a human? A dog?

It is my uneducated impression that embodied evolution has one important thing in 
common with classical artificial AI: the “genome” of neural networks, i.e., the feature 
that can be varied at random and submitted for selection, is explicit, observable, and 
controllable, quite like the molecular genome. This is a very strong realistic side of it. A 
big general question is: could that be in principle different for natural AI and, for that 
matter, natural-natural intelligence? Obviously, the parameters in EVE can be varied at 
random, but EVE seems less deterministic because random process is the very essence of 
blind selection. Biological selection is probabilistic and has no goal. Artificial AI 
selection is done by evaluating the result, like in sports. 

This problem has an eerie aura of logical paradoxes and I do not yet have an answer. It might be
impossible to get it other than from neurophysiology, but if so, is pattern chemistry of any value? One
value is certain: it stimulates questions.

If not a firm conclusion, then at least a soft one:

How far can we progress in understanding the origin of thought and language? “How far 
back can we go ahead,” should I say because the question addresses a very distant past. 

The answer is: as far as a general theory permits. In my firm opinion, this general theory 
cannot be a theory of origin of thought and language, but a theory of any origin of
complex systems, in particular, exystems. This theory, as any theory of natural process,
should have its own kind of generalized physics and chemistry which could tell us what 
mental constructs are more realistic than others. For two major reasons Pattern Theory 
(Ulf Grenander) is the only paradigm of this caliber: (1) in its trans-disciplinary approach 
it considers human thoughts, material objects, and anything between them on an equal 
basis as combinatorial structures and (2) it distinguishes between more probable and less 
probable structures and their transformations. 

Finally, I want to accentuate that this project comes from Ulf Grenander’s “Patterns of 
Thought ” and the foundation which he had laid down around 1970, the time of many new 
and fundamental ideas in science, some still neglected or overwhelmed by faster sellable 
trends. For more on the mystery of 1970 see Introduction to Pattern Chemistry . 

Shouldn’t I also mention other predecessors, from Kant and Plato to the first speakers of 
Nean, in reversed order? 

It was Hegel who tried to pull out the big and complex world (in a trans-disciplinary fashion!) 
from the hat filled simply with being (das Sein ) and nothing (das Nichts). Thus, if King 
Solomon says that there is nothing new under the sun, obviously, the opposite idea must be so 
reasonable that it needs to be denied. Niels Bohr summarizes: “The opposite of a trivial truth 
is false; the opposite of a great truth is also true.” 

MATLAB scripts (amateurish) are published separately and referred to in APPENDIX 3, 
next pages. 



Last updated: November, 2010 




First version published: January 29, 2010 










APPENDIX 1. PREVIOUS RELATED E-PUBLICATIONS 

(also stored at SCRIBD) 

Molecules and Thoughts: Pattern Complexity and Evolution in Chemical Systems and the Mind (2003) ; SCRIBD

TIKKI TIKKI TEMBO and the Chemistry of Protolanguage (2004) ; SCRIBD 
Pattern Theory and “Poverty of Stimulus” Argument in Linguistics (2004) ; SCRIBD 
The Three Little Pigs : Chemistry of language acquisition (2005) ; SCRIBD 
Salt: The Incremental Chemistry of Language Acquisition (2005) ; SCRIBD 
Salt 2: Incremental Extraction of Grammar by Simplistic Rules (2005) ; SCRIBD 
The Chemistry of Semantics (2005) ; SCRIBD 
Do Piraha speak Nean? (2007) ; SCRIBD 


APPENDIX 2 . SOME QUESTIONS AND ANSWERS 

The following questions and answers may help clarify the approach of pattern 
chemistry. 


1. Q: Why do I need to refer to the Great Depression or grumble about the 
dysfunctional American government in an opus on origin of language? 

A: Pattern chemistry, as, more generally, Pattern Theory, is trans-disciplinary. It sees 
the world through the filter of patterns. Pattern is a space of configurations related 
through a similarity transformation. Transformation makes sense for not less than two 
configurations. The more configurations, the more distinct the pattern. 

2. Q: In Bourbaki’s scale of sets, a combination of elements of a set enters the set as a 
new element. If this is a general pattern of complexification, what prevents it in nature? 
It is usually said that the brain contains billions of cells and an even larger number of
connections between them. If so, the brain could be as lost in its own complexity as the American political
system. Yet even some college dropouts can make swift and clever decisions in complex
matters. What limits the connectivity?

A: The actual history of the system. Only one or a few out of many combinatorial
possibilities in fact happen. The history of an exystem squeezes out a thin spaghetti from
the combinatorial space of connectivity.

3. Q: Marco Mirolli and Domenico Parisi ask: How can we explain the emergence of a 
language which benefits the hearer but not the speaker? (Connection Science, 17(3-4): 
307-324). I take the question just as an example, but how would pattern chemistry 
answer this and similar questions about benefits and fitness? 

A: The idea that beneficial changes are preserved and evolution is survival of the fittest 
is the key tenet of Darwinism. Yet it has always looked like a tautology. The statement 
that the world was created in six days does not look circular at all and, although decently 
false, is capable of winning some minds. The paradigm of embodied artificial evolution is 
more consistent: fitness is defined as performance of a certain function. If something 
extra happens, it is recorded, and so be it. The very essence of neural network requires 
somebody or something to approve or reject the state of the network and the distribution 
of connectivity and weights. 

The answer of pattern chemistry is typically chemical: evolution is a search for stability. 
Since both hearer and speaker are subsystems of the same exystem, even if they do not 
cooperate or interact in any way, the property of verbal communication stabilizes the 
system by adding bonds. There is more finesse, however: if two trends compete, the 
fastest is more probable to win or dominate. This releases us from the duty to justify 
haplology, contraction, loss of grammatical gender and case endings, or any other 
simplification in the evolution of language. If I am not mistaken, Kolmogorov 
complexity (although I am not a fan of it) is, roughly, the minimal and, therefore, the fastest 
to transmit length of a certain string of symbols. Faster and slower are the prime terms 
of chemical kinetics. 


APPENDIX 3 : SCRIPTS and WORKSPACE 


Any trajectory AA (as a vertical array: AA =[....]' ) can be replayed by script repro 
(in WORLD space) or script tr3d in time, quite like a video. 

Codes are in separate APPENDIX file , periodically revised, 
last revision : February 19, 2010 







scripts and workspace as text file: http://spirospero.net/eve-scripts.txt
scripts and workspace as pdf: http://spirospero.net/eve-scripts.pdf


http://www.scribd.com/doc/26046555/APPENDIX-to-PATTERN-CHEMISTRY-OF-THOUGHT-AND-SPEECH


LIST OF SCRIPTS:

Part 1: SCALE

1. SCALE
2. LINK
3. PLOTW

[Figure: graphic output of plotw (animated)]

Part 2: EVE

1. INIT
2. P3D3
3. SELECT3
4. NUMS
5. SAYCON2
5A. SAYCON
6. STACK
7. TR
8. TR3D
9. REPRO (replays AA)
10. REPRO2d (2D distribution of actons)

[Figure: graphic output of repro. Axis Z is space of WORLD (1:62, animated)]



























[Figure: graphic output of tr3d. Axis Z is time (animated)]

[Figure: graphic output of repro2d]


Last updated: November, 2010 








APPENDIX to: 

Yuri Tarnopolsky 

PATTERN CHEMISTRY OF THOUGHT AND SPEECH AND THEIR 
HYPOTHETICAL ANCESTOR 


main page: 

http://spirospero.net/thought-and-speech.pdf and

http://www.scribd.com/doc/26045810/Yuri-Tarnopolsky-PATTERN-CHEMISTRY-OF-THOUGHT-AND-SPEECH


Check for updates; last update: November, 2010 


APPENDIX contains workspace and MATLAB scripts 


scripts as text file: http://spirospero.net/eve-scripts.txt
scripts as pdf: http://spirospero.net/eve-scripts.pdf

http://www.scribd.com/doc/26046555/APPENDIX-to-PATTERN-CHEMISTRY-OF-THOUGHT-AND-SPEECH


CONTENTS 

Part 1 Workspace 
Part 2 SCALE 
Part 3 EVE 

Part 4 Example of a batch command 
Part 5 Structure of WORLD (62x62) 











Part 1. WORKSPACE 

RECONSTRUCTION OF WORKSPACE FOR EVE 


The workspace for EVE needs only WW.mat (62 x 62) and NAMES.mat (62 x 12) 


1. RECONSTRUCTING WW 
Connectivity list generated from sparse(WW) 


WWM= [ 29,2, 30,2, 35,2, 36,2, 29,3, 30,4, 31,4, 32,4, 33,4, 34,4, 31,5, ...
32,6, 37,6, 41,6, 45,6, 49,6, 52,6, 56,6, 62,6, 33,7, 34,8, 35,9, ...
36,10, 37,10, 38,10, 39,10, 38,11, 39,12, 40,12, 40,13, 41,14, 42,14, ...
42,15, 43,15, 46,15, 47,15, 43,16, 44,16, 44,17, 48,17, 50,17, ...
45,18, 46,18, 58,18, 47,19, 48,19, 49,19, 51,20, 54,20, 60,20, ...
50,21, 51,21, 52,21, 53,22, 55,22, 59,22, 61,22, 53,23, 54,23, ...
55,24, 56,24, 57,24, 57,25, 58,26, 59,27, 60,27, 61,28, 62,28, ...
2,29, 3,29, 30,29, 35,29, 36,29, ...
2,30, 4,30, 29,30, 31,30, 32,30, 33,30, 34,30, 35,30, 36,30, ...
4,31, 5,31, 30,31, 32,31, 33,31, 34,31, ...
4,32, 6,32, 30,32, 31,32, 33,32, 34,32, 37,32, 41,32, 45,32, 49,32, 52,32, 56,32, 62,32, ...
4,33, 7,33, 30,33, 31,33, 32,33, 34,33, ...
4,34, 8,34, 30,34, 31,34, 32,34, 33,34, ...
2,35, 9,35, 29,35, 30,35, 36,35, ...
2,36, 10,36, 29,36, 30,36, 35,36, 37,36, 38,36, 39,36, ...
6,37, 10,37, 32,37, 36,37, 38,37, 39,37, 41,37, 45,37, 49,37, 52,37, 56,37, 62,37, ...
10,38, 11,38, 36,38, 37,38, 39,38, ...
10,39, 12,39, 36,39, 37,39, 38,39, 40,39, ...
12,40, 13,40, 39,40, ...
6,41, 14,41, 32,41, 37,41, 42,41, 45,41, 49,41, 52,41, 56,41, 62,41, ...
14,42, 15,42, 41,42, 43,42, 46,42, 47,42, ...
15,43, 16,43, 42,43, 44,43, 46,43, 47,43, ...
16,44, 17,44, 43,44, 48,44, 50,44, ...
6,45, 18,45, 32,45, 37,45, 41,45, 46,45, 49,45, 52,45, 56,45, 58,45, 62,45, ...
15,46, 18,46, 42,46, 43,46, 45,46, 47,46, 58,46, ...
15,47, 19,47, 42,47, 43,47, 46,47, 48,47, 49,47, ...
17,48, 19,48, 44,48, 47,48, 49,48, 50,48, ...
6,49, 19,49, 32,49, 37,49, 41,49, 45,49, 47,49, 48,49, 52,49, 56,49, 62,49, ...
17,50, 21,50, 44,50, 48,50, 51,50, 52,50, ...
20,51, 21,51, 50,51, 52,51, 54,51, 60,51, ...
6,52, 21,52, 32,52, 37,52, 41,52, 45,52, 49,52, 50,52, 51,52, 56,52, 62,52, ...
22,53, 23,53, 54,53, 55,53, 59,53, 61,53, ...
20,54, 23,54, 51,54, 53,54, 60,54, ...
22,55, 24,55, 53,55, 56,55, 57,55, 59,55, 61,55, ...
6,56, 24,56, 32,56, 37,56, 41,56, 45,56, 49,56, 52,56, 55,56, 57,56, 62,56, ...
24,57, 25,57, 55,57, 56,57, ...
18,58, 26,58, 45,58, 46,58, ...
22,59, 27,59, 53,59, 55,59, 60,59, 61,59, ...
20,60, 27,60, 51,60, 54,60, 59,60, ...
22,61, 28,61, 53,61, 55,61, 59,61, 62,61, ...
6,62, 28,62, 32,62, 37,62, 41,62, 45,62, 49,62, 52,62, 56,62, 61,62 ];



Reconstruction 


1. enter: WWM 


2. enter: 

WW=zeros(62,62); lwwm=length(WWM); 

for j = (1 : 2 : lwwm), wx = WWM(j); wy = WWM(j+1); WW(wx, wy) = 1; end 


Check with the original: 
» sparse(WW) 


ans = 

(29,2) 1   (30,2) 1   (35,2) 1   (36,2) 1 
(29,3) 1 
(30,4) 1   (31,4) 1   (32,4) 1   (33,4) 1   (34,4) 1 
(31,5) 1 
(32,6) 1   (37,6) 1   (41,6) 1   (45,6) 1   (49,6) 1   (52,6) 1   (56,6) 1   (62,6) 1 
(33,7) 1 
(34,8) 1 
(35,9) 1 
(36,10) 1   (37,10) 1   (38,10) 1   (39,10) 1 
(38,11) 1 
(39,12) 1   (40,12) 1 
(40,13) 1 
(41,14) 1   (42,14) 1 
(42,15) 1   (43,15) 1   (46,15) 1   (47,15) 1 
(43,16) 1   (44,16) 1 
(44,17) 1   (48,17) 1   (50,17) 1 
(45,18) 1   (46,18) 1   (58,18) 1 
(47,19) 1   (48,19) 1   (49,19) 1 
(51,20) 1   (54,20) 1   (60,20) 1 
(50,21) 1   (51,21) 1   (52,21) 1 
(53,22) 1   (55,22) 1   (59,22) 1   (61,22) 1 
(53,23) 1   (54,23) 1 
(55,24) 1   (56,24) 1   (57,24) 1 
(57,25) 1 
(58,26) 1 
(59,27) 1   (60,27) 1 
(61,28) 1   (62,28) 1 
(2,29) 1   (3,29) 1   (30,29) 1   (35,29) 1   (36,29) 1 
(2,30) 1   (4,30) 1   (29,30) 1   (31,30) 1   (32,30) 1   (33,30) 1   (34,30) 1   (35,30) 1   (36,30) 1 
(4,31) 1   (5,31) 1   (30,31) 1   (32,31) 1   (33,31) 1   (34,31) 1 
(4,32) 1   (6,32) 1   (30,32) 1   (31,32) 1   (33,32) 1   (34,32) 1   (37,32) 1   (41,32) 1   (45,32) 1   (49,32) 1   (52,32) 1   (56,32) 1   (62,32) 1 
(4,33) 1   (7,33) 1   (30,33) 1   (31,33) 1   (32,33) 1   (34,33) 1 
(4,34) 1   (8,34) 1   (30,34) 1   (31,34) 1   (32,34) 1   (33,34) 1 
(2,35) 1   (9,35) 1   (29,35) 1   (30,35) 1   (36,35) 1 
(2,36) 1   (10,36) 1   (29,36) 1   (30,36) 1   (35,36) 1   (37,36) 1   (38,36) 1   (39,36) 1 
(6,37) 1   (10,37) 1   (32,37) 1   (36,37) 1   (38,37) 1   (39,37) 1   (41,37) 1   (45,37) 1   (49,37) 1   (52,37) 1   (56,37) 1   (62,37) 1 
(10,38) 1   (11,38) 1   (36,38) 1   (37,38) 1   (39,38) 1 
(10,39) 1   (12,39) 1   (36,39) 1   (37,39) 1   (38,39) 1   (40,39) 1 
(12,40) 1   (13,40) 1   (39,40) 1 
(6,41) 1   (14,41) 1   (32,41) 1   (37,41) 1   (42,41) 1   (45,41) 1   (49,41) 1   (52,41) 1   (56,41) 1   (62,41) 1 
(14,42) 1   (15,42) 1   (41,42) 1   (43,42) 1   (46,42) 1   (47,42) 1 
(15,43) 1   (16,43) 1   (42,43) 1   (44,43) 1   (46,43) 1   (47,43) 1 
(16,44) 1   (17,44) 1   (43,44) 1   (48,44) 1   (50,44) 1 
(6,45) 1   (18,45) 1   (32,45) 1   (37,45) 1   (41,45) 1   (46,45) 1   (49,45) 1   (52,45) 1   (56,45) 1   (58,45) 1   (62,45) 1 
(15,46) 1   (18,46) 1   (42,46) 1   (43,46) 1   (45,46) 1   (47,46) 1   (58,46) 1 
(15,47) 1   (19,47) 1   (42,47) 1   (43,47) 1   (46,47) 1   (48,47) 1   (49,47) 1 
(17,48) 1   (19,48) 1   (44,48) 1   (47,48) 1   (49,48) 1   (50,48) 1 
(6,49) 1   (19,49) 1   (32,49) 1   (37,49) 1   (41,49) 1   (45,49) 1   (47,49) 1   (48,49) 1   (52,49) 1   (56,49) 1   (62,49) 1 
(17,50) 1   (21,50) 1   (44,50) 1   (48,50) 1   (51,50) 1   (52,50) 1 
(20,51) 1   (21,51) 1   (50,51) 1   (52,51) 1   (54,51) 1   (60,51) 1 
(6,52) 1   (21,52) 1   (32,52) 1   (37,52) 1   (41,52) 1   (45,52) 1   (49,52) 1   (50,52) 1   (51,52) 1   (56,52) 1   (62,52) 1 
(22,53) 1   (23,53) 1   (54,53) 1   (55,53) 1   (59,53) 1   (61,53) 1 
(20,54) 1   (23,54) 1   (51,54) 1   (53,54) 1   (60,54) 1 
(22,55) 1   (24,55) 1   (53,55) 1   (56,55) 1   (57,55) 1   (59,55) 1   (61,55) 1 
(6,56) 1   (24,56) 1   (32,56) 1   (37,56) 1   (41,56) 1   (45,56) 1   (49,56) 1   (52,56) 1   (55,56) 1   (57,56) 1   (62,56) 1 
(24,57) 1   (25,57) 1   (55,57) 1   (56,57) 1 
(18,58) 1   (26,58) 1   (45,58) 1   (46,58) 1 
(22,59) 1   (27,59) 1   (53,59) 1   (55,59) 1   (60,59) 1   (61,59) 1 
(20,60) 1   (27,60) 1   (51,60) 1   (54,60) 1   (59,60) 1 
(22,61) 1   (28,61) 1   (53,61) 1   (55,61) 1   (59,61) 1   (62,61) 1 
(6,62) 1   (28,62) 1   (32,62) 1   (37,62) 1   (41,62) 1   (45,62) 1   (49,62) 1   (52,62) 1   (56,62) 1   (61,62) 1 



2. RECONSTRUCTING NAMES.mat, size 62 x 12 


NAMES=char('!', 'sow', 'old', 'pig', 'three', 'first', 'second', 'third', 'poor', ... 
'sent', 'away', 'seek', 'fortune', 'meet', 'man', 'bundle', 'straw', 'say', ... 
'give', 'house', 'build', 'wolf', 'come', 'ask', 'enter', 'no', 'blow', 'eat', ... 
'sow old', 'sow pig', 'pig three', 'first pig', 'second pig', 'third pig', 'sow poor', ... 
'sow sent', 'sent first', 'sent away', 'sent seek', 'seek fortune', 'first meet', ... 
'meet man', 'man bundle', 'bundle straw', 'first say', 'say man', 'man give', ... 
'give straw', 'give first', 'straw build', 'build house', 'first build', 'wolf come', ... 
'come house', 'wolf ask', 'ask first', 'ask enter', 'say no', 'wolf blow', ... 
'blow house', 'wolf eat', 'eat first'); 


Part 2: SCALE 

1. SCALE 

% PROGRAM SCALE, script sc; RELATED: link; 

% 1. INITIALIZE 

S=input(' 1 to start, 2 to continue '); 
nn=input('enter number of cycles (nn): '); 

disp(' ATTENTION: Interspaced components should be entered') 
disp(' between apostrophes as single string of characters'); 
disp(' '); 

disp(' press any key to continue'); pause; 

% 2. START 

for n=1:nn, if (S==1)&(n==1), sw=1; W=zeros(sw,16,8); 
NAMES=['!']; WW=zeros(sw,sw); NM=[]; 
end 

% 3. INPUT 

I=input('enter components '); 

% 4. CONVERT INPUT into PIN; 

PIN=zeros(8,8); word=[]; jj=1; I=[I,' ']; ab=abs(I); lab=length(ab); 
for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); PIN(jj,1:lw)=word; word=[]; jj=jj+1; end, end 

% 5. CHECKING novelty of PIN against WORLD 

old=0; wiold=0; comp=[]; % PIN presumed new 

for wi=1:sw % wi runs over the entries of WORLD 
% 5.1 CUTTING FLAT SLICE OF WORLD W(wi,:,:) 
fW=zeros(16,8); fW(:,:)=W(wi,:,:); 

% 5.2 COMPARE flat W and PIN as ordered sets 

eqid=isequal(fW(9:16,:),PIN); if eqid==1, old=1; % PIN is OLD 
disp('This is old '), disp(NAMES((wi),:)), 
wiold=wi; list=find(WW(:,wiold)); 
leli=length(list); SN=[]; for k=1:leli, SN=strvcat(SN,NAMES(list(k),:)); 
end 
end 

% 5.3 COMPARE flat W and PIN as sets 

set=0; 
if isempty(setxor(fW((1:8),:),PIN,'rows')), if old==0, 
if eqid~=1, 
note=['Looks like W',int2str(wi),': ',NAMES(wi,:)]; 
disp(note); set=1; end, end, old=1; end 

%%%%%% 

if set==1, 
nnew=input(' Is it new? 1/0 '); 
if nnew==1, old=0; end, end 

% 5.4. CHECK for SET MEMBERS to create SPECTRUM; 

eqcomp=intersect(PIN,fW,'rows'); h=size(eqcomp,1); test=zeros(1,8); 
for hh=1:h, 
if ((~isempty(eqcomp))&(~isequal(eqcomp(hh,:),test))), 
comp=[comp,wi]; end, end 
end % for wi=1:sw 

spectrum=unique(comp); les=length(spectrum); 

%disp('spectrum: '); %for r=1:les, disp(NAMES(spectrum(r),:)); end 
%if old==1, SNU=unique(SN,'rows'); disp('OLD spectrum '), disp(SNU), end, 

% 6. IF NEW, EXPAND THE WORLD: 

% 6.1. NAME THE NEW ENTRY 

if old==0, NAME=input('This is new. Name it '); 
NAMES=strvcat(NAMES,NAME); 

% 6.2. CREATE NewPIN containing NAME 

NAME=[NAME,' ']; NewPIN=zeros(8,8); word=[]; jj=1; ab=abs(NAME); 
lab=length(ab); for i=1:lab, if (ab(i))~=32, word=[word,ab(i)]; else 
lw=length(word); NewPIN(jj,1:lw)=word; word=[]; jj=jj+1; end, 
end 

% 6.3. ADD PIN and NewPIN to WORLD 

sw=sw+1; WW(sw,:)=0; WW(:,sw)=0; % PLACE IN THE WORLD 
W(sw,9:16,:)=NewPIN(:,:); W(sw,1:8,:)=PIN(:,:); 

% All 16 lines in WORLD are filled up 
% 6.4. RECORD NEW LINKS 
for ll=1:les, if ~isempty(spectrum(ll)), 
WW(sw,spectrum(ll))=1; WW(spectrum(ll),sw)=1; end, end, 
end % if old==0, 

%To read an entry from W(x,y,:) 
%D=nonzeros(W(x,y,:)); D=D'; setstr(D); 

end % for n=1:nn 

NM=NAMES; world=WW; 

disp('If you want to check WORLD for a name, type: link ') 
disp('To see NAMES, type: NAMES or NM '); 
disp('To display the 3D world, type plotw '); 
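
Steps 4, 5.2 and 5.3 of SCALE can be tried in isolation. The sketch below is only an illustration under stated assumptions: it uses strsplit instead of the character loop of the script, and the two input strings and the names inputs, PINS, same_ordered and same_as_set are hypothetical.

```matlab
% Sketch: pack space-separated components into 8x8 code matrices (step 4)
% and compare them as ordered sets (5.2) and as plain sets (5.3).
inputs = {'sow old', 'old sow'};        % hypothetical inputs, same components
PINS = zeros(8, 8, 2);
for s = 1:2
    parts = strsplit(inputs{s}, ' ');
    for jj = 1:numel(parts)
        codes = double(parts{jj});      % character codes of one component
        PINS(jj, 1:numel(codes), s) = codes;
    end
end
A = PINS(:, :, 1); B = PINS(:, :, 2);

same_ordered = isequal(A, B);                   % order matters
same_as_set  = isempty(setxor(A, B, 'rows'));   % order ignored
first_word   = char(nonzeros(A(1, :))');        % read a row back as text
```

The two inputs differ as ordered sets but coincide as sets, which is the case where SCALE prints "Looks like W..." and asks whether the entry is new.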


2. LINK 


% CHECKING THE WORLD FOR NAME; SCRIPT: link 

I=input('NAME to check '); 

CI=cellstr(I); 

SC=nonzeros(strmatch(CI,NAMES)); % find NAME's numbers 
LINKS=find(WW(:,SC)); links=LINKS'; 

LL=[]; % list of links 
NAMELIST=[]; 
for i=1:length(SC) 
LL=[LL,(find(WW(:,SC(i))))']; 
end 

lel=length(LL); 

for r=1:lel, NAM=(NAMES(LL(r),:)); NAMELIST=strvcat(NAMELIST,NAM); end, 
NAMELIST=unique(NAMELIST,'rows'); disp(NAMELIST) 
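
A minimal sketch of the same lookup on a four-entry fragment of the WORLD. It is an assumption-laden illustration, not the script itself: it matches the name exactly with strcmp, whereas strmatch in LINK also accepts names that merely begin with the query, and the variable names query, idx, neighbors and neighbor_names are hypothetical.

```matlab
% Sketch: find the neighbors of a named entry in a small WORLD.
% Entries and links are a fragment of the 62-entry data set.
NAMES = char('!', 'sow', 'old', 'sow old');
WW = zeros(4);
WW(2,4) = 1; WW(4,2) = 1;               % sow  <-> sow old
WW(3,4) = 1; WW(4,3) = 1;               % old  <-> sow old

query = 'sow old';
idx = find(strcmp(cellstr(NAMES), query));     % exact match; padding ignored
neighbors = find(WW(:, idx))';                 % row numbers of linked entries
neighbor_names = cellstr(NAMES(neighbors, :));
disp(neighbor_names)
```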


3. PLOTW 



%script: plotW, plot 3D WORLD 
%tic 

axis([-1.2, 1.2, -1.2, 1.2, 0, 1.2]); stack, om=length(WW); 

PX=zeros(om,1); PY=zeros(om,1); PZ=[1:om]; PZ=PZ'/om; 
angles=(2*pi/om).*[0:om]; 

for j=1:om, PX(j)=cos(angles(j)); PY(j)=sin(angles(j)); end 

cor=zeros(om,4); cor(:,1)=[1:om]'; cor(:,2)=PX; cor(:,3)=PY; cor(:,4)=PZ; 
for j=1:om, 

fw=find(WW(j,:)); lfw=length(fw); 
for k=1:lfw 
kk=fw(k); if kk<=j, 

X=[cor(j,2) cor(kk,2)]; Y=[cor(j,3) cor(kk,3)]; Z=[cor(j,4) cor(kk,4)]; 
pause(0.1); plot3(X,Y,Z,'Color','black','LineWidth',1); 
end, 

end, pause(0.1); 
end 
%toc 



Graphic output of plotw (animated) 


































Part 3: EVE 


1. INIT 

%open workspace 

om=length(WW); P(1:om,3)=1/om; 
angles=(2*pi/om).*[0:om-1]; 

% AR: arity 

%AR=0; for i=1:om, line=find(WW(i,:)); sl=length(line); AR(i)=sl; end 
%for k=1:om, NUMBERS(k,1)=k, end; 


2. P3D3 

%PIG, script p3d3.m, 

axis([-1.1, 1.1, -1.1, 1.1, 0, 1.1]); hold on 
%DISPLAY NUMBERS open p3d62.fig 

LC=[1 1 0]; 

%t=input('Enter pause between the cycles t ') % t=0.1 sec 

tra=[]; % tra: trajectory 

om=length(WW); AA=zeros(n,1); % AA contains subsequent actons 
ain=a; %ain: storage of the initial acton; 

PZ=[1:om]; PZ=PZ'/om; 

%%%ARITY CALCULATION 

%%%%%for y=1:62, lfw=length(find(WW(y,:))); P(y,4)=lfw; end 

%DISPLAY PARAMETERS IN FIGURE 

%PAR12=['G=',num2str(G)]; text(-1.2,-1.2,1,PAR12); PAR12=['H=',num2str(H)]; text(-1.2,-1.2,1.2,PAR12); 
%PAR12=['C=',num2str(C)]; text(-1.2,-1.2,1,PAR12); PAR12=['F=',num2str(F)]; text(-1.2,-1.2,0.9,PAR12); 
%PAR12=['a=',num2str(aa)]; text(-1.2,-1.2,0.8,PAR12); PAR12=['acton =',NAMES(aa,:)]; text(-1.2,-1.2,0.7,PAR12); 
%PAR12=['n=',num2str(n)]; text(-1.2,-1.2,0.6,PAR12); 


% % % % PURGE INITIAL PROBABILITIES P. 

% P(1:om,3)=1/om; % Equal probabilities. 

% or: use script "init" 


%%%% PREVIOUS ACTON DATA 

P(1,3)=0; % empty word '!' cannot be selected 
P(ain,3)=0; pra=ain; %pra: previous acton 

% DISPLAY ain as green circle 

text(0.9*(cos(angles(ain))),0.9*(sin(angles(ain))),0,'o','color','green','FontSize',16); 
for jj=1:n % MAIN LOOP BEGINS 

% START ACTON SELECTION 

select3 

% END ACTON SELECTION 

% ACTON ASTERISK DISPLAY; Randomization of the position of ASTERISK. 

% RANDOMIZATION OF SIGN 

u=rand/30; uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end, u=u*uuu; 
v=rand/30; vv=rand; if vv<=0.5, vvv=1; else vvv=-1; end, v=v*vvv; 

%text((cos(angles(a))+u),(sin(angles(a))+v),'color',[1 0 0],'FontSize',18); 
%pause(t), 

text((cos(angles(a))+u),(sin(angles(a))+v),PZ(a),'*','color','red','FontSize',18); 

pause(0.1), 

%text(0.8*(cos(angles(a))+u),0.8*(sin(angles(a))+v),'color','green','FontSize',18), 

%!!!!!!!!!!!!!!!!!!!!!!! 


LLX=cos(angles(a))+u; 

LLY=sin(angles(a))+v; 

LLZ=PZ(a); LLC=[LLX LLY LLZ]; 

axis([-1.2, 1.1, -1.1, 1.1, 0, 1.1]); 

X1=LC(1); X2=LLC(1); Y1=LC(2); Y2=LLC(2); Z1=LC(3); Z2=LLC(3); 

X=[X1 X2]; Y=[Y1 Y2]; Z=[Z1 Z2]; hold on, 
plot3(X,Y,Z,'Color','red','LineWidth',1); 

LC=LLC; 

%END ASTERISK DISPLAY 
% END OF PROTO FOLLOWS 


AA(jj,1)=a; 
end 


tra=AA'; 


PP=zeros(om,2); PP(:,1)=P(1:om,1); PP(:,2)=P(1:om,3); 


%disp('type PP for probability distribution'); 
%disp('type tra for trajectory'); 

%disp(' '), disp(['total content ',num2str(ld)]); 
%disp(['condensate size ',num2str(ld)]); 
%disp(['JINI = ',num2str(JINI)]); 

stack 


3. SELECT3 


%ACTON SELECTION 

% START PROBABILITY DISTRIBUTION 

pra=ain; 

P(pra,3)=H*C; % influence of former acton on neighbors; for calculating influence only 

for k=1:om, 

INF=0; line=find(WW(k,:)); sl=length(line); ar=sqrt(sl); 
for j=1:sl, neighbor=line(j); INF=INF+G*P(neighbor,3)/ar; end, 

%INFLUENCE 

P(k,3)=P(k,3)-F*P(k,3)+INF; 

end 

P(a,3)=0; P(1,3)=0; %P(a,3) reverted to 0 

%END PROBABILITY DISTRIBUTION 

% START ACTON SELECTION 

%P(a,3)=0; %cannot be selected again 
pra=a; % previous acton "a" remembered as "pra" 

P(1,3)=0; SUM=0; ss=0; S=P(1:om,3); SUM=sum(S); 

P(1:om,3)=P(1:om,3)/SUM; 

%probabilities are normalized 
r=rand; for i=1:om, ss=ss+P(i,3); if ss>r, a=i; break, end 

end %NEW ACTON "a" HAS BEEN SELECTED 
P(pra,3)=C; %new probability for former acton; now it can take part in next selection 
%P(1,3)=0; % '!' and acton cannot be selected 
% END ACTON SELECTION 
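
The two stages of SELECT3, the influence update and the selection by running sum, can be followed on a toy example. This is a sketch under assumptions: the three-node chain and the starting weights are hypothetical, the zeroing of '!' and of the current acton is left out, G and F are taken from the batch example in Part 4, and a fixed number replaces rand so that the result is reproducible.

```matlab
% Sketch of one SELECT3-style step on a hypothetical 3-node chain 1-2-3.
WW = [0 1 0; 1 0 1; 0 1 0];   % hypothetical connectivity, not the 62-node WORLD
P  = [0.2; 0.5; 0.3];         % hypothetical starting weights
G  = 1; F = 0.8;              % values used in the batch example

% Influence update, in place: a node updated later already sees the
% new weights of nodes updated earlier, as in the loop of the script.
for k = 1:size(WW, 1)
    nb  = find(WW(k, :));                       % neighbors of node k
    INF = G * sum(P(nb)) / sqrt(numel(nb));     % influence scaled by sqrt(arity)
    P(k) = P(k) - F * P(k) + INF;
end

% Normalize and select by cumulative sum (roulette wheel).
p = P / sum(P);
r = 0.5;                      % fixed value standing in for rand
a = find(cumsum(p) > r, 1, 'first');
disp(a)
```

Dividing the incoming influence by the square root of the arity keeps highly connected entries such as "first pig" from dominating the distribution simply because they have many neighbors.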



4. NUMS 


% % NUMERICAL ANALYSIS AND JINI 

%nums calculates occurrence of a cell in the trajectory and Jini coeff. 

% N(:,1) lists cells, N(:,2) lists occurrence 
%open NAMES.mat; 

DDA=[]; srt=sort(AA); 

ls=length(srt); lss=ls-1; U=unique(AA); lu=length(U); N=zeros(lu,2); 

N(:,1)=U; j=1; VV=0; 

for k=1:lss, %over srt, except the last 

if srt(k)==srt(k+1), VV=VV+1; else N(j,2)=VV+1; j=j+1; VV=0; end 

end, 

N(lu,2)=VV+1; 

ns=N(:,2)'; ns=sort(ns); lns=length(ns); 
JIN=[1:lns]; JIN(1)=ns(1); 
for k=2:lns, JIN(k)=JIN(k-1)+ns(k); end, 

SJ=sum(JIN); SE=(JIN(lns)*lu)/2; JINI=SJ/SE; 

D=sortrows(N,2); NMS=NAMES(D(:,1),:); ld=length(D); 

%D, lu, 'ns', ns', 'JIN', JIN', JINI 

%N 

%JIN 

% ddn OCCURRENCE DISTRIBUTION 

disp(' '); disp('content:'); disp(' '); 

DDA=D(:,1); 

%base=D(:,1)'; NAMES(base,:) %display content names only 
ldda=length(DDA); 

%DISPLAY WORD No, OCCURRENCE, WORD 

for i=1:ldda, disp([num2str(i),' ',num2str(D(i,1)),' ',num2str(D(i,2)),' ',NAMES(D(i,1),:)]), end 

%for i=1:ldda, disp([num2str(D(i,2)),' ',NAMES(D(i,1),:)]), end 

cont=D(:,1); content=sort(cont); 

%for f=1:ldda, disp([num2str(TT(f)),' ',num2str(TT(f,2)),' ',NAMES(AA(f,1),:)]), end 


%disp(' '), disp(['total content ',num2str(ld)]); disp(['JINI = ',num2str(JINI)]); 
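
The counting and the JINI ratio of NUMS can be reproduced in a few vectorized lines. The sketch below is an illustration only: the six-step trajectory is hypothetical, and unique, accumarray and cumsum replace the explicit loops. The ratio is computed exactly as in the script (sum of the cumulative counts divided by half the product of the total count and the number of distinct cells); no claim is made that it equals a textbook inequality coefficient.

```matlab
% Sketch: occurrence counts and the JINI ratio as defined in nums.
AA = [41 62 34 62 41 62];               % hypothetical short trajectory
[U, ~, idx] = unique(AA);               % distinct cells, ascending
counts = accumarray(idx(:), 1)';        % occurrences of each distinct cell

ns   = sort(counts);                    % counts in ascending order
JIN  = cumsum(ns);                      % running totals, as in the JIN loop
JINI = sum(JIN) / (JIN(end) * numel(U) / 2);
disp([U; counts]), disp(JINI)
```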



5. SAYCON2 


% Program saycon2 as script, can be used as function(content,connector) 

% saycon = "say_con(content,connector)" 

%Part 1: SPLIT: split complex cells 

%lists content, builds structure SPLIT: name, nleft, nright 
%splits composite name into component NAMES 

wl=[]; wm=[]; wr=[]; d=1; 

lcnt=length(content); SPLIT(1).name=[]; SPLIT(1).nleft=[]; SPLIT(1).nright=[]; 

for k=1:lcnt, SPLIT(k).name=cellstr(NAMES(content(k),:)); 
end; 

% SPLIT(:,:).name generates list of names; 

fst=[]; %for find str 

%sspl=size(SPLIT); ssp=sspl(2); 

csp=0; 

for k=1:lcnt, fst=findstr(' ',char(SPLIT(k).name)); %find space between components 
%if no space: 

if isempty(fst)==1, SPLIT(k).nleft=char(SPLIT(k).name); SPLIT(k).nright=[]; %only left component 
%if space 

else csp=char(SPLIT(k).name); lcsp=length(csp); %total NAME 

SPLIT(k).nleft=csp(1:(fst-1)); SPLIT(k).nright=csp((fst+1):lcsp); fst=[]; 
end, 
end 


%Part 2: CONDENSATE 


disp(' '); disp('condensate:'); disp(' '); 

TRIPLET(1).L=[]; TRIPLET(1).R=[]; TRIPLET(1).M=[]; 
TRIPLET(1).TRI=[]; 

TRI1=['_7_7_7_7_7;']; 

nt=0; nnt=0; 
for c=1:lcnt, 

for m=(1:lcnt), 

dd=isequal(SPLIT(c).nleft,SPLIT(m).nright); 
if dd==1, d=d+1; 

wl=SPLIT(m).nleft; wr=(SPLIT(c).nright); wm=(SPLIT(c).nleft); else 

TRI2=[wl,'_',wm,'_',wr,'; ']; 

end, 

tt=isequal(TRI1,TRI2); % if not and if wr not empty: 
if tt==0, ttt=isempty(wr); 
if ttt==0, 

NC=[wl,' ',wm]; NM=[wm,' ',wr]; 

%NC=char(SPLIT(c).name); 

[tfc,locc]=ismember(NC,NAMES,'rows'); 

%NM=char(SPLIT(m).name); 

[tfm,locm]=ismember(NM,NAMES,'rows'); 

if abs(locc-locm)<=5, 

% 5 is the range of narrow condensate 

% NARROW CONDENSATE IS MARKED WITH ASTERISK 

nnt=nnt+1; disp([TRI2,' * ']), else 
disp(TRI2), 

%disp([wl,'_',wm,'_',wr,'; ']), 
end 

nt=nt+1; 

end, 

TRI1=TRI2; 
end, %if tt==0, 

TRIPLET(d).L=wl; TRIPLET(d).R=wr; TRIPLET(d).M=wm; %for a record 
TRIPLET(d).TRI=[TRIPLET(d).L,'_',TRIPLET(d).M,'_',TRIPLET(d).R,'; ']; 

end, %for m=(1:lcnt) 

end %for c=1:lcnt, 

disp(' '), disp(['total content ',num2str(ld)]); 

disp(['condensate ',num2str(nt)]); disp(['narr. cond.',num2str(nnt)]); 
disp(['JINI = ',num2str(JINI)]); disp(['cond/cont = ',num2str(nt/ld)]); 
disp(['nr.cnd/cnt = ',num2str(nnt/ld)]); 



5A. SAY_CON 


% Program say_con as script, can be used as function(content,connector) 

%function say_con = say_con(content,connector) 

%Part 1: SPLIT: split complex cells 

%lists content, builds structure SPLIT: name, nleft, nright 
%splits name into components 
wl=[]; wm=[]; wr=[]; d=1; 

lcnt=length(content); SPLIT(1).name=[]; SPLIT(1).nleft=[]; SPLIT(1).nright=[]; 

for k=1:lcnt, SPLIT(k).name=cellstr(NAMES(content(k),:)); end; 

% SPLIT(:,:).name generates list of names; 

fst=[]; %for find str 

sspl=size(SPLIT); 

ssp=sspl(2); 

csp=0; 

for k=1:lcnt, fst=findstr(' ',char(SPLIT(k).name)); %find space between components 

if isempty(fst)==1, SPLIT(k).nleft=char(SPLIT(k).name); SPLIT(k).nright=[]; %only left component 

else csp=char(SPLIT(k).name); lcsp=length(csp); 

SPLIT(k).nleft=csp(1:(fst-1)); SPLIT(k).nright=csp((fst+1):lcsp); fst=[]; 
end, end 

%Part 2: DOUBLETS AND TRIPLETS 

disp(' '); 

disp('condensate:'); 
disp(' '); 

% CONDENSATE SIZE %%%%%%%%%%% 

TRIPLET(1).L=[]; TRIPLET(1).R=[]; TRIPLET(1).M=[]; 

TRIPLET(1).TRI=[]; 

TRI1=['_7_7_7_7_7;']; 
nt=0; 

for k=1:lcnt, mm=1; 

for m=((mm+1):lcnt), 

dd=isequal(SPLIT(k).nleft,SPLIT(m).nright); 
if dd==1, d=d+1; 

wl=SPLIT(m).nleft; wr=(SPLIT(k).nright); wm=(SPLIT(k).nleft); else 

TRI2=[wl,'_',wm,'_',wr,'; ']; 

end, 

tt=isequal(TRI1,TRI2); if tt==0, ttt=isempty(wr); if ttt==0, disp(TRI2); nt=nt+1; 

%%%%%%%%%% 

TRI1=TRI2; end, end 

TRIPLET(d).L=wl; TRIPLET(d).R=wr; TRIPLET(d).M=wm; %for a record 
TRIPLET(d).TRI=[TRIPLET(d).L,'_',TRIPLET(d).M,'_',TRIPLET(d).R,'; ']; 

end, end 

disp(' '), disp(['total content ',num2str(ld)]); 
disp(['condensate ',num2str(nt)]) 

disp(['JINI = ',num2str(JINI)]); disp(['cond/cont = ',num2str(nt/ld)]); 
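
The idea behind the condensate, chaining a doublet "L M" with a doublet "M R" into the triplet L_M_R, can be shown in a few lines. This is a simplified sketch rather than a replacement for the scripts: it assumes that every entry of content is a doublet, it keeps no TRIPLET record, it does not mark narrow condensate, and the names parts and triplets are hypothetical.

```matlab
% Sketch: chain doublets that share a middle word into triplets.
content = {'sow sent', 'sent first', 'wolf eat'};   % doublet names from NAMES
parts = cellfun(@(s) strsplit(s, ' '), content, 'UniformOutput', false);

triplets = {};
for c = 1:numel(parts)
    for m = 1:numel(parts)
        % doublet m = "L M" chains with doublet c = "M R"
        if strcmp(parts{m}{2}, parts{c}{1})
            triplets{end+1} = [parts{m}{1}, '_', parts{c}{1}, '_', parts{c}{2}];
        end
    end
end
disp(triplets)
```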


6. STACK 

%script: stack 
ZZ=zeros(1,41); tt=0; 
for zz=0:0.1:1 

t=0:pi/20:(2*pi); ZZ(1,:)=zz; tt=ZZ; 
plot3(sin(t),cos(t),tt), hold on, tt=tt+1; 
end, axis([-1, 1, -1, 1, 0, 1.1]); 

7. TR 

%tr replays TRAJECTORY as NAMES 

% compare with tr3d, which replays 3D TRAJECTORY IN TIME 

%compare with repro: 3D TRAJECTORY IN SPACE; replays AA trajectory in WORLD 

%NUMBERS AND WORDS 

ltra=length(AA); for i=1:ltra, ntr=AA(i); disp([num2str(i),' ',num2str(ntr),' ',NAMES(ntr,:)]), end 

%NUMBERS 

% ltra=length(tra); for i=1:ltra, ntr=tra(i); disp([num2str(i),' ',num2str(ntr)]), end 


% TRAJECTORY TO SAVE AS TT 

T=zeros(2,n); ltra=length(AA); ntr=0; for i=1:ltra, ntr=AA(i); T(1,i)=i; T(2,i)=ntr; 
end, TT=T'; 


8. TR3D 


% tr3d: replays AA as 3D TRAJECTORY IN TIME 
% compare with tr: replays TRAJECTORY as NAMES 
%compare with repro: 

%3D TRAJECTORY IN SPACE; replays AA trajectory in WORLD 

n=length(AA); om=length(WW); 
h=2; %height 

figure, axis([-1.4, 1.4, -1.4, 1.4, 0, 2.1]); view(-160,55), hold on, 

%DISPLAY NUMBERS 

for zz=0:0.2:h, t=0:pi/20:(2*pi); ZZ(1,:)=zz; tt=ZZ; 
plot3(sin(t),cos(t),tt), hold on, tt=tt+1; end 

zz=2/n; 

X0=[cos(angles(ain)) cos(angles(AA(1)))]; 

Y0=[sin(angles(ain)) sin(angles(AA(1)))]; 

Z0=[0 zz]; 

plot3(X0,Y0,Z0,'Color','black','LineWidth',1); 

pause(0.1); 
nn=(n-1); 
for k=(1:nn), 

X=[cos(angles(AA(k))) cos(angles(AA(k+1)))]; 

Y=[sin(angles(AA(k))) sin(angles(AA(k+1)))]; 

Z=[zz*(k) zz*(k+1)]; 

plot3(X,Y,Z,'Color','black','LineWidth',2); hold on 

pause(0.1); 

end 

for j=1:om, text(1.15*cos(angles(j)),1.15*sin(angles(j)),num2str(j),'FontSize',8), 
end 




Graphic output of tr3d 
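
The geometry of tr3d can be checked without opening a figure. In this sketch, offered only as an illustration, each acton of the trajectory is placed on the unit circle at its own angle and lifted along Z in proportion to its step number; the four-step trajectory is the beginning of the example input given under REPRO, and the names X, Y and Z are used as in the script.

```matlab
% Sketch: coordinates of a trajectory replayed in time (Z axis = step number).
om = 62;                                % size of the WORLD
AA = [41 62 34 49];                     % first four actons of the REPRO example
angles = (2*pi/om) .* (0:om-1);         % one angle per entry, as in init
n = numel(AA);

X = cos(angles(AA));                    % position on the unit circle
Y = sin(angles(AA));
Z = (2/n) .* (1:n);                     % height grows with time up to h = 2
disp([X; Y; Z]')
```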


9. REPRO replays AA (trajectory) 


EXAMPLE OF INPUT: 

ENTER: 

AA=[ 41 62 34 49 48 57 51 55 21 58 45 49 46 62 51 43 46 19 50 17 60 33 ... 
58 40 62 47 37 19 39 62 38 28 52 62 56 49 45 49 38 41 11 62 37 45 37 28 ... 
49 16 62 58]; repro; 

[ENTER] 


REPRO.m 

%repro: 3D TRAJECTORY IN SPACE; replays AA trajectory in WORLD 
% compare with tr3d, which replays 3D TRAJECTORY IN TIME 
%compare with tr, which replays TRAJECTORY as NAMES 

cd C:\MATLAB6p5\work\PIG; load NAMES.mat, load WW.mat, 
n=length(AA); ain=AA(1); nn=n-1; om=length(WW); zz=1/om; 

angles=(2*pi/om).*[0:om-1]; ZV=[0:zz:1]; %COORDINATES 

omm=om-1; figure, stack, axis([-1.2, 1.2, -1.2, 1.2, 0, 1.1]); view(-160,55), 

%DISPLAY NUMBERS 

%for j=1:om, text(1.1*cos(angles(j)),1.1*sin(angles(j)),num2str(j),'FontSize',8), end 

%"ain" display 

text(0.9*(cos(angles(ain))),0.9*(sin(angles(ain))),'o','color','green','FontSize',20); 

% ASTERISK AND LINE DISPLAY; 

for jj=1:nn; a1=AA(jj); a2=AA(jj+1); 

% Randomization of the position of ASTERISK. 

u=rand/25; uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end, u=u*uuu; 
v=rand/25; vv=rand; if vv<=0.5, vvv=1; else vvv=-1; end, v=v*vvv; 
w=rand/25; ww=rand; if ww<=0.5, www=1; else www=-1; end, w=w*www; 

%ASTERISK 

text(0.9*(cos(angles(a2))+u),0.9*(sin(angles(a2))+v),(ZV(a2)+w),'*','color','red','FontSize',14); 

pause(0.1), hold on % 

%LINE 

X1=0.9*cos(angles(a1))+u; Y1=0.9*sin(angles(a1))+v; Z1=ZV(a1)+w; 

X2=0.9*cos(angles(a2))+u; Y2=0.9*sin(angles(a2))+v; Z2=ZV(a2)+w; 

X=[X1 X2]; Y=[Y1 Y2]; Z=[Z1 Z2]; 
plot3(X,Y,Z,'Color','red','LineWidth',1,'LineStyle','-'); hold on, 

% END ASTERISK AND LINE DISPLAY 
end %jj; 

disp([' n: ',num2str(n),' initial acton:',num2str(ain)]) 

disp('run tr for TRAJECTORY with names, type tr3d, for 3D TRAJECTORY') 



Graphic output of repro 


10. REPRO2D, requires WW.mat and NAMES.mat 

%repro2d: 2D distribution of actons over NAMES 
% compare with tr3d, which replays 3D TRAJECTORY IN TIME 
%compare with tr, which replays TRAJECTORY as NAMES 

n=length(AA); ain=AA(1); nn=n-1; om=length(WW); 
angles=(2*pi/om).*[0:om-1]; %COORDINATES 
figure, axis([-1.2, 1.2, -1.2, 1.2]); hold on 
t=0:pi/20:(2*pi); plot(sin(t),cos(t)), 

%DISPLAY NUMBERS 

for j=1:om, text(1.1*cos(angles(j)),1.1*sin(angles(j)),num2str(j),'FontSize',8), end 
%"ain" display 

text((cos(angles(ain))),(sin(angles(ain))),'o','color','green','FontSize',20); 

%ASTERISK AND LINE DISPLAY; 
for jj=1:nn; a1=AA(jj); a2=AA(jj+1); 

% Randomization of the position of ASTERISK. 

u=rand/25; uu=rand; if uu<=0.5, uuu=1; else uuu=-1; end, u=u*uuu; 
v=rand/25; vv=rand; if vv<=0.5, vvv=1; else vvv=-1; end, v=v*vvv; 

%ASTERISK 

text((cos(angles(a2))+u),(sin(angles(a2))+v),'*','color','red','FontSize',14); 
pause(0.1), hold on % 

%LINE, OPTIONAL 

%X1=cos(angles(a1))+u; Y1=sin(angles(a1))+v; X2=cos(angles(a2))+u; Y2=sin(angles(a2))+v; 

%X=[X1 X2]; Y=[Y1 Y2]; plot(X,Y,'Color','red','LineWidth',1,'LineStyle','-'); hold on, 


% END ASTERISK AND LINE DISPLAY 
end %jj; 

disp([' n: ',num2str(n),' initial acton:',num2str(ain)]) 

disp('run tr for TRAJECTORY with names, type tr3d, for 3D TRAJECTORY') 



Graphic output of repro2d 




Part 4 


Example of a batch command 

» C=0.5; G=1; H=0.8; F=0.8; 
init, n=25 

a=22; figure, stack, view(-160,55), 

p3d3, disp(' '); disp(['AA=[',num2str(AA'),']; ']), 

nums, say_con, disp(['ain ',num2str(ain),', a fin ',num2str(a)]), 

n=25; 

figure, stack, view(-160,55), 

p3d3, disp(' '); disp(['AA=[',num2str(AA'),']; ']), 

nums, content=D(:,1)'; say_con, 

disp(['ain ',num2str(ain),', a fin ',num2str(a)]) 

Part 5 Structure of WORLD (62 x 62) 


AR: Arity 


No. 

NAME 

Neighbors (No.) 

Neighbors (NAMES) 

AR 

1 

J 

0 

| 

0 

2 

SOW 

29, 30, 35, 36 

sow old , sow pig , sow poor , sow sent 

4 

3 

old 

29 

sow old 

1 

4 

Pig 

30,31,32,33,34 

sow pig , pig three , first pig , second pig , 
third pig 

5 

5 

three 

31 

pig three 

1 

6 

first 

32,37,41,45,49, 
52, 56, 62 

first pig, sent first, first meet, first say , give 
first, first build, ask first, eat first 

8 

7 

second 

33 

second pig 

1 

8 

third 

34 

third pig 

1 

9 

poor 

35 

sow poor 

1 

10 

sent 

36, 37, 38, 39 

sow sent, sent first, sent away , sent seek , 

4 

11 

away 

38 

sent away 

1 

12 

seek 

39, 40 

sent seek, seek fortune 

2 

13 

fortune 

40 

seek fortune 

1 

14 

meet 

41,42 

first meet, meet man 

2 

15 

man 

42, 43, 46, 47 

meet man , man bundle, say man , man give 

4 

16 

bundle 

43,44 

man bundle, bundle straw 

2 

17 

straw 

44, 48, 50 

bundle straw, give straw, straw build , 

3 

18 

say 

45,46,58 

first say , say man , say no 

3 

19 

give 

47, 48, 49 

man give , give straw , give first 

3 

20 

house 

51, 54, 60 

build house , come house , blow house 

3 

21 

build 

50,51,52 

straw build , build house , first build 

3 

22 

wolf 

53, 55, 59, 61 

wolf come , wolf ask, wolf blow , wolf eat 

4 




23 

come 

53, 54 

wolf come , come house 

2 

24 

ask 

55, 56, 57 

wolf ask, ask first, ask enter 

3 

25 

enter 

57 

ask enter 

1 

26 

no 

58 

say no 

1 

27 

blow 

59, 60 

wolf blow, blow house 

2 

28 

eat 

61,62 

wolf eat, eat first 

2 

29 

sow old 

2, 3, 30, 35, 36 

sow , old , sow pig , sow poor , sow sent 

5 

30 

sow pig 

2, 4,29,31,32, 

33, 34, 35, 36 

sow , pig , sow old , pig three , first pig , 
second pig , third pig , sow poor , sow sent 

9 

31 

pig three 

4, 5, 30, 32, 33, 

34 

pig , three , sow pig , first pig , second pig , 
third pig 

6 

32 

first pig 

4, 6, 30, 31, 33, 
34,37,41,45,49, 
52, 56, 62 

pig , first, sow pig , pig three , second pig , 
third pig , sent first, first meet, first say , 
give first, first build , ask first, eat first 

13 

33 

second pig 

4, 7, 30, 31, 32, 

34 

pig , second , sow pig , pig three , first pig , 
third pig 

6 

34 

third pig 

4, 8, 30, 31, 32, 

33 

pig , third , sow pig , pig three , first pig , 
second pig 

6 

35 

sow poor 

2, 9, 29, 30, 36 

sow , poor , sow old , sow pig , sow sent 

5 

36 

sow sent 

2, 10,29,30,35, 
37, 38, 39 

sow , sent, sow old , sow pig , sow poor, 
sent first, sent away , sent seek 

8 

37 

sent first 

6, 10, 32, 36, 38, 
39,41,45,49, 52, 
56 12, 62 

first, sent, first pig , sow sent, sent away , 
sent seek , first meet, first say , give first, 
first build , ask first, eat first 

12 

38 

sent away 

10, 11, 36, 37, 39 

sent, away , sow sent, sent first, sent seek 

5 

39 

sent seek 

10, 12,36,37,38, 
40 

sent, seek , sow sent, sent first, sent away , 
seek fortune 

6 

40 

seek fortune 

12, 13, 39 

seek, fortune , sent seek 

3 

41 

first meet 

6, 14, 32, 37, 42, 
45, 49, 52, 56, 62 

first, meet, first pig , sent first, meet man , 
first say , give first, first build , ask first, eat 
first 

10 

42 

meet man 

14, 15,41,43,46, 
47 

meet, man , first meet, man bundle , say 
man, man give 

6 

43 

man bundle 

15, 16, 42,44, 46, 
47 

man , bundle , meet man , bundle straw, say 
man, man give 

6 

44 

bundle straw 

16, 17, 43,48, 50 

bundle , straw , man bundle , give straw , 
straw build 

5 

45 

first say | 6, 18, 32, 37, 41, 46, 49, 52, 56, 58, 62 | first, say, first pig, sent first, first meet, say man, give first, first build, ask first, say no, eat first | 11
46 | say man | 15, 18, 42, 43, 45, 47, 58 | man, say, meet man, man bundle, first say, man give, say no | 7
47 | man give | 15, 19, 42, 43, 46, 48, 49 | man, give, meet man, man bundle, say man, give straw, give first | 7
48 | give straw | 17, 19, 44, 47, 49, 50 | straw, give, bundle straw, man give, give first, straw build | 6
49 | give first | 6, 19, 32, 37, 41, 45, 47, 48, 52, 56, 62 | first, give, first pig, sent first, first meet, first say, man give, give straw, first build, ask first, eat first | 11
50 | straw build | 17, 21, 44, 48, 51, 52 | straw, build, bundle straw, give straw, build house, first build | 6
51 | build house | 20, 21, 50, 52, 54, 60 | house, build, straw build, first build, come house, blow house | 6
52 | first build | 6, 21, 32, 37, 41, 45, 49, 50, 51, 56, 62 | first, build, first pig, sent first, first meet, first say, give first, straw build, build house, ask first, eat first | 11
53 | wolf come | 22, 23, 54, 55, 59, 61 | wolf, come, come house, wolf ask, wolf blow, wolf eat | 6
54 | come house | 20, 23, 51, 53, 60 | house, come, build house, wolf come, blow house | 5
55 | wolf ask | 22, 24, 53, 56, 57, 59, 61 | wolf, ask, wolf come, ask first, ask enter, wolf blow, wolf eat | 7
56 | ask first | 6, 24, 32, 37, 41, 45, 49, 52, 55, 57, 62 | first, ask, first pig, sent first, first meet, first say, give first, first build, wolf ask, ask enter, eat first | 11
57 | ask enter | 24, 25, 55, 56 | ask, enter, wolf ask, ask first | 4
58 | say no | 18, 26, 45, 46 | say, no, first say, say man | 4
59 | wolf blow | 22, 27, 53, 55, 60, 61 | wolf, blow, wolf come, wolf ask, blow house, wolf eat | 6
60 | blow house | 20, 27, 51, 54, 59 | house, blow, build house, come house, wolf blow | 5
61 | wolf eat | 22, 28, 53, 55, 59, 62 | wolf, eat, wolf come, wolf ask, wolf blow, eat first | 6
62 | eat first | 6, 28, 32, 37, 41, 45, 49, 52, 56, 61 | first, eat, first pig, sent first, first meet, first say, give first, first build, ask first, wolf eat | 10
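
The neighbor lists in the rows above appear to follow one rule: a doublet is connected to its own two words and to every other doublet that shares a word with it. The following Python sketch (not the MATLAB code of this appendix) illustrates that inferred rule. The list DOUBLETS holds only doublets 45 to 62, so it reproduces a row exactly only when all of its neighbors fall in that range; the function name neighbors is hypothetical.

```python
# Assumption: two generators are neighbors when they share a word.
# Only doublets 45 to 62 from the table are listed here.
DOUBLETS = [
    "first say", "say man", "man give", "give straw", "give first",
    "straw build", "build house", "first build", "wolf come", "come house",
    "wolf ask", "ask first", "ask enter", "say no", "wolf blow",
    "blow house", "wolf eat", "eat first",
]


def neighbors(doublet, doublets=DOUBLETS):
    """Return the two words of `doublet`, then every other doublet sharing a word."""
    words = doublet.split()
    linked = [d for d in doublets
              if d != doublet and set(d.split()) & set(words)]
    return words + linked


print(neighbors("say no"))          # compare with row 58
print(len(neighbors("ask enter")))  # compare with the count in row 57
```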


Last revised: November, 2010 





















PATTERN CHEMISTRY OF LANGUAGE 


Yuri Tarnopolsky 


The links presented here refer to a series of e-publications PATTERN CHEMISTRY OF 
LANGUAGE. It is an attempt to apply Pattern Theory of Ulf Grenander (Brown 
University; see numerous links on the Web) to the atomistic aspects of language. While 
Pattern Theory is a mathematical discipline, Pattern Chemistry is a small contribution to 
its big domain by a chemist with a close connection to Ulf Grenander’s work and a 
lifelong interest in languages, some as different as Russian, Hebrew, Japanese, and 
Hungarian, apart from main European ones. 

The story of this series is best told in the first person. 

I am an organic chemist, Ph.D., who happened to have a lot of spare time to think about 
matters outside my immediate profession. In the late 1970s, when the political 
atmosphere in Soviet Russia was growing tenser by the day, I was especially intrigued by 
the problem of stability of social structures and the further fate of the strained social 
system. Much later, this direction of thought resulted in “ History as Points and Lines ,” 
initiated and co-authored by Ulf Grenander. The same question of stability can be asked 
about the structures of language in the process of historical evolution as well as 
individual language acquisition. 

In 2002, I ran into The Atoms of Language by Mark C. Baker. I took his question “what 
if the words were atoms” completely seriously, which was quite natural in the framework 
of Pattern Theory. That was the initial impetus to the current line of work. 

Contrary to a widespread view, chemistry is not just a description of millions of 
substances and their transformations. The central organizing idea of chemistry, borrowed 
from physics, is that between an initial stable state of atoms connected in a particular way 
and the final stable state of the same atoms connected in a different way, there is an 
unstable fleeting state of transition (transition state) from one to the other. The second 
theoretical idea is that if we have a structure of atoms connected with bonds in a certain 
order, then any other arrangement of the same atoms can emerge spontaneously. In fact, 
chemical reactions result in only a few products because the rest of the transformations are 
negligibly slow. The third idea is that the speed of transformation depends on the height 
of the transition barrier between the initial and final state. Pattern Theory provides a 
measure of the stability of abstract structures of any origin, including those of thought 
and its expression. 
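
In chemistry the third idea has a standard quantitative form, the Arrhenius equation k = A exp(-Ea/RT). The following sketch only illustrates how strongly the rate depends on the barrier height; the pre-exponential factor and the two barrier values are arbitrary illustrative numbers, not data from this series.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def rate_constant(prefactor, barrier_j_per_mol, temperature_k):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-barrier_j_per_mol / (R * temperature_k))


# Illustrative values only: same prefactor, two different barriers, room temperature.
fast = rate_constant(1e13, 50_000, 298.0)
slow = rate_constant(1e13, 100_000, 298.0)
print(fast / slow)  # doubling the barrier slows the transformation enormously
```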

Obviously, these ideas can be easily generalized over other objects built of atomic entities 
and connections between them, such as structures of language. Pattern Theory from the 
point of view of a chemist is exactly this kind of generalized chemistry. If applied to 
language, Pattern Chemistry regards its generation and evolution as a natural process 
governed by relative stability and kinetics of linguistic structures. 


Pattern Theory of Ulf Grenander studies structural complexity regardless of 
interdisciplinary boundaries. It reduces structure to a set of atomic entities (generators) 
selectively connected by bonds, thereby representing observable objects of widest variety, 
including language and thought. The unusual aspect of Pattern Theory is its metrics 
which allows for distinguishing between more and less stable (i.e., probable) structures. 
Pattern Chemistry focuses not so much on stable structures as on the fleeting transition 
states between them, similarly to the way chemistry treats molecular transformations, 
making distinction between fast and slow transformations. 

The parallel between linguistics and chemistry has been a standard explanatory model 
since the discovery of the DNA’s structure and its ability to carry a protein 
"meaning." Besides, chemistry uses a particular language (chemical nomenclature) to 
convert complex non-linear structures (“chemical thoughts”) into a linear word which can 
potentially be communicated through speech. 

The central idea of Pattern Chemistry is that complexity in nature and society evolves 
from simple states by simple steps. This perspective of complexity differs from the 
concept of dynamic complexity represented by systems of differential equations in the 
so-called “order-from-chaos” theories. Indeed, this is an “order-from-order” concept, more 
in line with the Darwinian paradigm of evolution. It seems interesting to investigate 
language acquisition, generation, and evolution in the discrete stepwise way, which is 
very similar to both ontogenesis/phylogenesis in biology and chemical transformations. 

Thus, e-papers SALT and SALT 2 explore the stepwise extraction of grammar from the 
text of the Hungarian folk tale “Salt,” based on extremely simple rules. I used the 
Hungarian text in order to eliminate all semantic associations natural for a native speaker 
of English and to emphasize the universality of the model regardless of grammar and 
vocabulary. 

The most recent e-publication in this collection, entitled, maybe too ambitiously, 
PATTERN CHEMISTRY OF THOUGHT AND SPEECH, comes closer to the 
evolutionary problems than any of the preceding ones. 

Human thought is not directly observable, apart from vague shadows in brain scans. We 
simply do not know what it is. We have nothing to support the belief that we think in 
linear sequences of whatever units, but we definitely speak in them. Moreover, the 
syntactic trees are definitely non-linear. The paper further explores a hypothetical 
protolanguage, called Nean, in which the simple elementary thought consisting of two 
connected entities directly translates into the simplest elementary phrase consisting of 
two words. Nean, unlike extant languages, is topologically identical with thought. 



Nean sounds like a repetitive random series of elementary doublets. Two doublets with 
a common element can be combined into linear triplets. The e-paper explores the ability of 
this inherently linear and primitive language to express more complex non-linear 
thoughts by means of the process of linearization. It appears that Nean, subjectively, is 
quite expressive, in spite of its primitiveness. 
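
The combination of doublets into triplets can be sketched in a few lines of Python. This is only an illustration under one assumption: doublets are treated as ordered pairs and are joined when the last word of one equals the first word of the other. The function name triplets is hypothetical and this is not the simulation described in the e-paper.

```python
def triplets(doublets):
    """Join ordered doublets that share a middle word into linear triplets."""
    result = []
    for a, b in doublets:
        for c, d in doublets:
            # Assumption: the shared element must end one doublet and start the other.
            if b == c and (a, b) != (c, d) and a != d:
                result.append((a, b, d))
    return result


print(triplets([("wolf", "blow"), ("blow", "house"), ("first", "pig")]))
```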

I present a computer simulation, based on simple principles, which represents thinking 
and speech as competition of alternative thought structures for a spot in consciousness 
and further generation of linear speech-ready expressions longer than elementary 
doublets. I use the story of the Three Little Pigs as the substrate of the process and 
discuss the potential of Nean for further complexification and grammaticalization. 

I regard Nean as a point of evolutionary divergence between thought and speech, both 
initially linear, after which a variety of non-linear grammars can emerge. The current 
collection of e-papers, or, rather, essays, illustrates the way to this idea, starting from 
MOLECULES AND THOUGHTS. 

Kinetics is the study of speed. In short, the main idea is: we say what we can say faster 
and we say faster what we can faster convert from nonlinear thought into linear speech. 

Unlike distant worlds and invisible atoms, language is as directly observable, rich, and 
exciting as the beauty of nature. It is no accident that so many outsiders have felt the 
irresistible pull of linguistics and tried their minds at solving its puzzles. I believe that the 
current collection, coming from the remote outskirts of linguistics, can illuminate the 
problems of evolutionary linguistics and intimate mechanisms of verbal communication 
from a kinetic angle and in chemical light. 

NOTE: The kinetic direction has already been tested in linguistics by Martin 
Nowak et al., who invoked the fundamental idea of competitive kinetic evolution 
(Manfred Eigen). 


LINKS 


1. Molecules and Thoughts: Pattern Complexity and Evolution in Chemical Systems and the Mind, 2003. 

2. TIKKI TIKKI TEMBO and the Chemistry of Protolanguage, 2004. 

3. Pattern Theory and “Poverty of Stimulus” Argument in Linguistics, 2004. 

4. The Three Little Pigs: Chemistry of language acquisition, 2005. 

5. Salt: The Incremental Chemistry of Language Acquisition, 2005. 

6. Salt 2: Incremental Extraction of Grammar by Simplistic Rules, 2005. 

7. The Chemistry of Semantics, 2005. 

8. Do Piraha speak Nean?, 2007. 

9. PATTERN CHEMISTRY OF THOUGHT AND SPEECH, 2010. 

9A. APPENDIX to PATTERN CHEMISTRY OF THOUGHT AND SPEECH: MATLAB codes. 


EMAUt: SIWfSPIRPSPERQ.NET 


MAIN PAGE: http://spirospero.net/complexity.htm 
HOME PAGE: http://spirospero.net 


November, 2010 
















Pattern chemistry of the origin of mind 


Yuri Tarnopolsky 


This project evolves 


In this essay I attempt to formulate a hypothetic mechanism of spontaneous emergence of 
complex systems. I take mind as a typical example. Other possible particular cases range 
from the origin of planets to human history and, to jump down the scale, human 
individuality. By complex system I mean exystem : Evolving Complex SYSTEM. 

The conceptual foundation for treating such “too big to succeed” problems is Pattern 
Theory (Ulf Grenander), which represents the abstract structure mathematically, i.e., 
regardless of tangible properties yet never losing touch with reality. 1 

I cannot give here even a short review of the principles of Pattern Theory except for 
saying that from the point of view of a chemist it is a perfect abstract chemistry of 
anything consisting of atomic entities (generators) and connections (bonds) between 
them, with probabilities (or energies) attributed to either. Patterns are open-ended 
collections of similar configurations. 2 Such configurations are exemplified by 
molecules. What I call pattern chemistry deals with intimate mechanisms of 
transformations of configurations. I list major sources in Appendix 1. 


1 Two most relevant and more accessible books by Ulf Grenander are Elements of Pattern Theory (1996) 
and A Calculus of Ideas (2012). 

2 Groups under similarity transformation on a set of configurations. 




















I can skip introductory explanations because I start ab ovo and use mostly visual 
representation typical for chemistry and engineering. As a chemist, I perceive the world 
in terms of atomic entities (generators of Pattern Theory or atoms of chemistry) and 
connections between them. This can be expressed as a connectivity matrix, but I prefer to 
visualize not only the structure but also the process of its construction, as if it were a 
machine or a bridge. 

In Appendix 2, I tell the story of the origin of this essay. I do it because the best way to 
understand something is either to build it or to study its history. For such intriguing 
exystems as life, mind, and society, building is not an option. 

The most important circumstance in my story is that I am neither a mathematician, nor a 
computer scientist, nor a historian but just a chemist with wide interests outside my field. 
In my current field of pattern chemistry I am alone and, without any need of grants, peer 
reviews, or tenure, can frolic as wild as it gets. Regarding Pattern Theory, however, I 
have been lucky to enjoy discussions with Ulf Grenander, as well as his attention and 
support, for a long time. 


PART ONE: HISTORY AS EXPLANATION 


THE PROBLEM 


The origin of the problem of spontaneous emergence predates recorded human history. It 
generated mythological and religious explanations, practically all of them descriptions of 
making or building by a mythological figure. Scientific inquiry seems to have started by 
the first half of the twentieth century, when theoretical physicists began to ask two 
separate questions: 

Question 1. How is life physically possible? 

Question 1 was answered in a very general form by Erwin Schrödinger (1887-1961) in 
What Is Life? The Physical Aspect of the Living Cell (1944). The modern answer, shaped 
by Ilya Prigogine (1917-2003) in the second half of the twentieth century, describes life 
as a dissipative structure far from equilibrium, which needs a constant supply of free 
energy. 3 


3 Free energy is directly convertible into work, as, for example, chemical energy, light, and electricity, but 
not heat. 






Question 2. Since life is so complex and spontaneous origin of complex 
systems is improbable, how could it spontaneously emerge? 

Question 2 implies two plausible assumptions, which I share. 

2A. Life is complex. 

2B. Spontaneous origin of complex systems is improbable. 

Both questions 1 and 2, in my opinion, demonstrate what I call the synchronic approach, 
which I do not share: origin is considered a single event with a beginning and an end. 
Yesterday there was no life and today—bang!—it is here. There is, however, a 
different—diachronic—approach, which I prefer: origin is a sequence of events lining up 
as the history of the object, in our case, still ongoing. It is always debatable which event 
can be called origin, where the sequence ends, and whether it ends at all. 

Simple objects, I presume, do not necessarily exhibit anything long enough to be called 
history. They can emerge spontaneously because the probability of a conjunction of a few 
favorable conditions can be substantial. Not only that, but simple systems can go through 
the same states again and again in the same or different order with the same average 
probability of each state. 4 Thus, a chemical transformation occurs spontaneously because 
it involves a few atoms and bonds. Even if the molecule is very big and if we somehow 
label individual atoms, they bond and split indefinitely in equilibrium. A large complex 
structure, however, can appear only once, diachronically, and in a sequence of simple 
steps. Origins of planets and of humanity on one of them seem to be two extreme 
examples of diachronicity. Next, I will consider something less grand than either of 
them. 
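
The arithmetic behind this contrast can be sketched as follows, under a strong simplifying assumption that is mine and not part of the argument: the favorable conditions are independent and each has the same probability p. The function name and the value of p are illustrative only.

```python
def conjunction_probability(p, n):
    """Probability that n independent conditions, each with probability p, all hold."""
    return p ** n


# A few conditions: substantial. Many conditions at once: practically impossible.
for n in (3, 10, 100):
    print(n, conjunction_probability(0.5, n))
```

A long sequence of simple steps avoids the problem because each step requires only a few conditions at a time.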


EVENT AND PROCESS 


While we can understand many physical, chemical, and biological objects and systems 
per se regardless of their origin and history, to understand what society and civilization 
are at a certain point requires significant knowledge of how they looked some time before 
that. The fresh example, while I am writing this (October, 2012), is the US Presidential 
Elections. To understand what is going on in 2012, we need to go back to Elections 2008 
and further back to at least 1994, the beginning of the Republican Revolution. 

4 It is called ergodicity. Exystems cannot be ergodic. They emerge, grow, decline, and die only once. 

Understanding is a process in the mind, a kind of intense, sometimes stressful small-scale 
evolution. To understand means to come to a relaxed stable state of mind when no big 
questions remain. Once we have understood something, we normally cannot 
un-understand it. 

With the yearbook of history opened on page 1994 we would still have important 
questions about the origin of the Republican Revolution and would have to go back to the 
Franklin D. Roosevelt Revolution better known as the New Deal, to which the former is, 
probably, a counter-revolution asking for a new revolution—a typical pattern of history. 

Some questions could be addressed to even earlier periods, and the American Civil War 
is most relevant. What is important, the more and more distant past would be less and 
less relevant to the initial question about Elections 2012. Some elementary knowledge of 
world history will do, while the origin of life and humans on earth would be completely 
irrelevant, unless in the obscurantist politics of the religious right. This is the essence of 
diachronic analysis. 

How the mind originated and how it works is a subject of tens or hundreds of thousands 
of printed and digital pages. The encyclopedic How the Mind Works by Steven Pinker 
(1997) is over 650 pages long, consists mostly of words, and begins with the honest “we 
don’t understand how the mind works.” The book of the same title by Carlo Lazzari 
(2007), 121 pages, contains mathematical and graphic material, symbolically, with lots of 
white space. Daniel Kahneman’s Thinking, Fast and Slow (2011), almost 500 pages, is 
about how foolish a mind can be, and has just a couple of charts. 

Science of the mind is a very large, complex, diverse, fragmented, and unsettled subject, 
itself in the process of fast evolution. Some central terms still remain undefined and 
crucial problems unsolved, which makes it all the more interesting. I am not going into 
details here, however. With so much money on the modern mind, for the mind to understand 
itself is like winning a basketball game on a full stomach. 

Recently, I have bumped into the ongoing study of the little worm C. elegans, including 
the details of its behavior and connectome—the map of connections between the cells of 
the nervous system. It was an additional stimulus to go back to my old idea. The connectome 
research, very high tech, goes far beyond simple creatures, embracing brains of humans 
and even their babies. 5 I think that there could be a way to test the ideas of this essay not 
only on the worm, but also on other detailed records of evolution, such as history. See 
Appendix 3. 

My pattern of thinking is entirely chemical and engineering in nature, which is different 
from the way most mathematicians and physicists think. It also has poetic elements, like 
the habit of metaphor, which makes me feel at home among patterns. This is the only 
natural way for me and I will do it in my ideographic manner: by drawing pictures on the 
sand with a twig. The same applies to the style. Patterns and metaphors do not know borders. 
And if I go overboard sometimes, so be it. 

5 See a remarkable video as an introduction to the field. 


CONTAMINATION BY MIND 


I am interested in a highly general and abstract problem: the origin of complexity from 
simplicity. It resonates, in a pattern way, with the reversed version of the ancient “heap” paradox. 

Direct version: 1,000,000 grains of sand is a heap of sand (Premise 1) 

A heap of sand minus one grain is still a heap. (Premise 2) 
Conclusion: 1 grain of sand is a heap. 

Reversed version: 1 grain is 1 grain. 

2 grains is 2 grains. 

What number of grains becomes a heap? 

I am asking not when simplicity turns into complexity—there is no answer—but how it 
happens. 

I propose the following hypothetical twofold principle of simplicity: 

1. Spontaneous origin of simple systems is probable. 6 7 

2. Complex systems spontaneously originate from simple systems by a sequence of 
simple steps. 

This kind of reasoning has a counterpart in the method of complete mathematical 
induction and in recursive functions. 

Example: Long molecules of nucleic acids and proteins, as well as polymers in general, 
are synthesized in organisms by repeating the same simple steps with relatively simple 
blocks and operations. 7 
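
The parallel with induction and recursion can be made concrete with a toy sketch: a long chain is defined as a shorter chain plus one simple step. The names grow and build and the four-letter alphabet are hypothetical illustrations, not a model of real biosynthesis.

```python
def grow(chain, block):
    """One simple step: attach one block to the end of the chain."""
    return chain + [block]


def build(blocks):
    """Recursive definition: the empty chain, or a shorter chain plus one step."""
    if not blocks:
        return []
    return grow(build(blocks[:-1]), blocks[-1])


# The same simple step, repeated, yields an arbitrarily long structure.
print("".join(build(list("ATGC" * 3))))
```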

Here the term spontaneous means: without the participation of a human or divine mind. I 
apply the same term to the evolutionary origin of cognition. The term spontaneous 
applies neither to newborns immersed in a human environment nor to robots designed 
by the human mind. Spontaneous is anything that has nothing to do with human mind or, as I 
would put it differently, mind-sterile. Artificial Intelligence, for example, reeks of 
human mind. 

6 It is not so simple. For example, sodium spontaneously burns in chlorine, 2 Na + Cl2 → 2 NaCl, but it 
is impossible to do it without a chemist’s participation. Somebody must bring the two elements in contact. 
A quantum system is a better example, but not a perfect one. 

7 There is a chicken-or-egg discussion about the beginning of life. I believe life started with 
polyphosphoric acids and was kept off equilibrium by tidal and diurnal cycles. 

Unfortunately, the words spontaneous and accidental are both contaminated with human 
mind that sets the baseline of what is planned, regular, and unavoidable. I would rather 
get used to mind-sterile. Not the mindless, indeed. There are only two mind-sterile media that 
I know: the blind chance generated by microscopic physical processes and the unbending 
macroscopic regularity of the solar system. In a way, human civilization is the drive to 
control—i.e., contaminate—blind chance. We corrupt it with order and it fights back. 

Any design of the mind that contains an algorithm is already contaminated with human 
mind and cannot serve as a model of spontaneous emergence. 

Is it possible to create sterile artificial conditions for a natural intelligence? The 
closest but still distant approximation could be creating the simplest artificial 
mind, just two cells, and letting it evolve to the human level by the principle of 
simplicity, in a simulated or duplicated animal or human environment, by a simple 
recursive algorithm. The problem is that the mind will need not just a body, but a 
company of peers. This is so-o-o-o complex but maybe doable. I think that Ulf 
Grenander’s GOLEM (in A Calculus of Ideas) was the first step in this direction. 
Some starting elements of my approach can be found in complexity: the content 
of an empty mind can be filled with ideas by an extremely simple procedure based 
on the concept of novelty. 8 My main idea is: forget the algorithms, let “it” live its 
own life and play the game of chance, win a new bone in its own skeleton, pay the 
loss by death, and have a history. 


PART TWO: EMERGENCE OF COMPLEXITY 


In this section I will try to dehumanize the model of emergence of cognition by the 
disinfectant of randomness as much as is humanly (oops!) possible. 

Let us consider the process of turning a grain of cognitive sand into a heap, step by step. 
It will take some page space, but in the end I will compress the symbolism to letter “h” 
as an ideogram for “combinatorial branching” with the main and secondary 
branches: h. 

8 Pattern Chemistry of Thought and Speech and their Hypothetical Ancestor and Molecules and Thoughts. 


STEP 1. POINTS AND LINES 


I start with a set of receptive points open to the stimuli of the external world. The 
simplest system of representing the world consists of one point capable of being in two 
states. I take a more complex but still simple case of a four-point system for representing 
the external world. 

My own representation is by no means formal and rigorous. In Pattern Theory a visual 
representation of structure consists only of points (generators) and lines (bonds), although 
a generator can have its own structure. I am describing a sloppy vague template to be 
deformed and hardened into something more formal, consistent, and abstract. At this 
point it is not really necessary. 

Figure 1 shows an assembly of four sensitive points and four lines. When acted upon, the 
cell generates a signal traveling through a line portrayed by an arrow. This and 
subsequent figures are only structures (configurations), in which the nature of points and 
lines does not matter. 

Simplicity is important for spontaneous emergence by bonding between four simplest 
singular points. 9 Obviously, its lines (bonds) are only potential because there is nothing 
at the other end. We will come to the arrow targets later. 

The four points and lines in Figure 1 can serve as a primitive analyzer and their outputs 
could be used in various ways. 

How can it become complex? 

Obviously, it can happen by developing complexity “downstream,” so that the emerging 
“mind” can grow beyond singular sensitive points and represent more complex external 
situations. For the mind they are just combinations of singular inputs. The difference 
between two states of the world is combinatorial. 



Figure 1. Points and lines 


9 Probability/improbability of spontaneous emergence is, possibly, a way to define simplicity/complexity in 
an incomplete but pragmatic way. The weakest point of all our reasoning about mind is definitions given by 
another mind. Compare with Essay 58 . 










STEP 2. COMBINATORY DERIVATION 


Next, we form, top-down, combinations of signals from the primary sensitive points. We 
deal with three abstract operators: black circle ●, ring ○, and square □. For example, 
they can mean logical operators AND (multiplication, ×), OR (addition, +), and NOT 
(negation, −). They can mean any other operation, not logical at all, for example, a 
formation of a neurophysiological contact, or a particular movement, or only its direction, 
or a spoken phrase. 


In Figure 2, we form five horizontal lines (red arrows), numbered as 2.1 to 2.5, each 
corresponding to a combination of the lines of the first generation. Their logical meaning 
is in the right column. A single ● means just identity, YES, or “presence.” 

2.1    1 
2.2    1 + 2 
2.3    1 × 2 
2.4    (1 + 4) × 3 
2.5    1 × (−2) × (3 + 4) 

Figure 2. Combinatory derivation 


An operation does not influence the 
function of the corresponding original 
(vertical black) line, which can 
participate in further derivations. The 
number (5) of derivations in Figure 2 
is arbitrary. 
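
Read as Boolean logic, which is only one of the possible meanings of the operators, the five derivations of Figure 2 can be written out directly. In this sketch the arguments s1 to s4 stand for the signals of the four primary points; the function name derive is hypothetical.

```python
from itertools import product


def derive(s1, s2, s3, s4):
    """Evaluate derivations 2.1 to 2.5 with AND, OR, and NOT as the operators."""
    return {
        "2.1": s1,                               # identity
        "2.2": s1 or s2,                         # 1 + 2
        "2.3": s1 and s2,                        # 1 x 2
        "2.4": (s1 or s4) and s3,                # (1 + 4) x 3
        "2.5": s1 and (not s2) and (s3 or s4),   # 1 x (NOT 2) x (3 + 4)
    }


# Four two-state points give sixteen combinatorially different states of the world.
for state in product([False, True], repeat=4):
    print(state, derive(*state))
```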


STEP 3. THE SCALE 


Figure 3 illustrates the formation of the vertical red lines of the second generation, 
which join the array of the original black lines. 

The lines of the second generation are subject to a new derivation resulting in vertical 
green lines of the third generation. 

In order to simplify the symbolism, I use the mesh rectangle to indicate the area of 
combinatorial derivation without specifying the operators. 















The buildup of complexity can be continued further as alternation of stages of 
combination and derivation, as well as an expansion of the primary sensitive points. 





TOWARD ULTIMATE MIND-STERILITY 


The mathematical representation of the process of expansion exists: it is the scale of sets 
described in Bourbaki’s Theory of Sets 10 (1970), p. 383. See Appendix 4. 


10 Nicolas Bourbaki, Elements of Mathematics: Theory of Sets, Addison-Wesley, originally published by 
Hermann (Paris), 1968 


























































































































































































































In short, there is a basic set P. The next step of expansion includes the basic set P and all 
combinations of its members. The next step uses the previous set as a new base set. And 
so on. The scale quickly goes into combinatorial explosion. The attractiveness of the 
original scale of sets, however, is that no combination can be missed and the process can 
be completely random. But its enormity needs some filters. Here is the most mindless 
one: let us combine elements of the lower step at random, preventing some or most from 
entering the next step. The result will be a subset of the scale of sets: a sparse scale of sets 
with many holes in it. The sparseness can be maintained also at random. The random 
death of points and lines may or may not lead to any particular direction of evolution, 
such as, for example, the growth of complexity. Randomness is the only disinfectant in 
the science of emergence. 
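
The contrast between the full and the sparse scale can be sketched in Python. This follows the informal description above (the previous set plus combinations of its members), not Bourbaki's formal construction. The names full_step and sparse_step, the parameter keep, and the restriction of the sparse step to pairs are my simplifying assumptions.

```python
import random
from itertools import combinations


def full_step(level):
    """Previous level plus all combinations (two or more members) of its members."""
    members = sorted(level, key=repr)
    new = {frozenset(c)
           for k in range(2, len(members) + 1)
           for c in combinations(members, k)}
    return set(level) | new


def sparse_step(level, keep, rng):
    """Previous level plus pairs of its members, each admitted with probability `keep`."""
    members = sorted(level, key=repr)
    new = {frozenset(pair)
           for pair in combinations(members, 2)
           if rng.random() < keep}
    return set(level) | new


P = {"a", "b", "c", "d"}
print(len(P), len(full_step(P)), len(full_step(full_step(P))))  # combinatorial explosion

rng = random.Random(0)
level = P
for _ in range(3):
    level = sparse_step(level, keep=0.2, rng=rng)
print(len(level))  # a much smaller scale, with many holes in it
```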

I am moving toward the main mind-building mechanism and this is a good opportunity for a reader to 
exercise his or her mind and predict from where it could come. 

The scale of sets is a curious and even intimidating creation of the mathematical mind. There 
are about 150 related web pages—a microscopic volume—and a good part of them are 
my own pages. As soon as we imagine a real process of a random growth of some 
biological network, sparse scale might look tamer and more like the very ideogram of 
evolution. 


After the principle of a step has been described, the graphic symbolism can be further 
simplified, as shown in Figure 4. 



Figure 4. Compact symbolisms of derivation (panel labels C and D; D is the ideogram h) 

The derivations do not need to be divided into sharply distinct generations. The 
combinations can include earlier generations in random fashion, as shown in Figure 5. 
We can also consider a gradual random dying out of much earlier generations in 
subsequent derivations, as in my example with American history. 


Next, we allow more randomness, up to complete randomness, of both combinations and 
derivations. Randomness means exclusion of participation of a second mind because the 
only thing neither mind nor computer can generate is a random number, which can be a 
negative definition of randomness. 11 

My last remark means that perfect randomness does not exist in any sufficiently 
complex system because if there is no participating mind or computer, a complex system 
cannot be in any of its states with equal probability. It is possible only in physical 
abstractions, like ideal gas. There are constraints of size on complex systems in a real 
world with Euclidean metrics. 




The next step of simplification is the simplification of operations. Apparently, the 
absolute minimum is two, approximated as YES and NO. There is no reason to expect 
that nature somehow learned mathematical logic or information theory. The logic of 
natural systems can be fuzzy or, let me say, “analogic.” Ulf Grenander’s Pattern Theory, 
in my non-professional opinion, introduces probabilistic logic and thus rises a step above 
the digital mainstream. 

I am aware of the term “neuro-fuzzy” applied to a hybrid of fuzzy logic with connectionist 
neural networks, but not more than that. I am not good at mathematics and my mind 
operates by similarities, connections, and images. I cannot go into this field, which is 
unfamiliar to me. Instead I am giving some engineering examples in Figure 6. 


If we attribute weights to lines and thresholds to derivations, with appropriate probability 
distributions, operations OR, AND, and NOT will become not random, but semi- 
deterministic. Figure 6 should be understood not literally, but as analogy. The operators 
work by weights (W) and thresholds (Th) . In some cases the outcome is clear, but when 
W and Th are close, the dice are rolled. The operators are so simple that each has some 
chance of spontaneous assembly. Nevertheless, the whole picture in Figure 3 does not 
look sterile enough to me because I have designed it. It needs the final purging touch. 
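
A semi-deterministic operator of this kind can be sketched as a weighted sum compared with a threshold. The function threshold_unit, the margin parameter that defines what "close" means, and the particular weights and thresholds below are illustrative assumptions, not values taken from Figure 6.

```python
import random


def threshold_unit(inputs, weights, threshold, margin, rng):
    """Clear outcome when the weighted sum is far from the threshold; a dice roll when close."""
    total = sum(w * x for w, x in zip(weights, inputs))
    if total >= threshold + margin:
        return True
    if total <= threshold - margin:
        return False
    return rng.random() < 0.5  # W and Th are close: chance decides


rng = random.Random(0)
# Hypothetical settings that behave like AND, OR, and NOT on clean 0/1 inputs.
AND = dict(weights=(1, 1), threshold=1.5, margin=0.25)
OR = dict(weights=(1, 1), threshold=0.5, margin=0.25)
NOT = dict(weights=(-1,), threshold=-0.5, margin=0.25)

print(threshold_unit((1, 1), rng=rng, **AND))
print(threshold_unit((0, 1), rng=rng, **OR))
print(threshold_unit((1,), rng=rng, **NOT))
print(threshold_unit((0.7, 0.7), rng=rng, **AND))  # sum 1.4 is near 1.5: outcome is random
```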


11 While the mind is growing, its mere size can create regularity in the form of gradients, limitations, and 
selective enhancements. Topology of the world, the order of the previous history, and the constraints of the 
skull may influence the structure of the mind (not to mention brain) at any step. However, the other mind is 
not to be blamed for that. 














Figure 6. Analogies 


HERE COMES DARWIN 


The above picture has one big flaw: there is no guarantee that the expansion would go 
far enough because of the possibility of equilibrium between its elements and their 
aggregates, in other words, between simpler blocks and larger blocks of blocks. There is 
no law of nature which would ensure the complexification, especially because 
chemical—not pattern-chemical—realism predicts this kind of equilibrium for any 
bonding. Life and equilibrium are incompatible. 

Evolution of life is unthinkable without death (or elimination in some form) as a selecting 
hand. Darwinian selection is the main guiding mechanism I hinted at earlier. Since the 
substrate of the mind is living tissue, the points and lines are no exception. 

Regarding selection, we know about selection of life forms more than about selection in 
the evolution of the mind. There are several plucking and weeding hands capable of 
natural selection: diurnal and tidal cycles, daily fluctuations of temperature and humidity, 
seasonal cycles, and ultimately large-scale random but frequent enough bouts of 
extinction like meteorites, volcanoes, droughts, floods, etc. These hands can rock the 
cradle of life—keep it off equilibrium—with the purpose of not letting it fall asleep. 

Note that this mechanism does not mean that all organisms move toward complexity. 
What it does is to make complex organisms possible. Bacteria coexist with humans. 

The weeding moves each species to its niche of stability. Evolution is the survival of not 
the fittest but the most stable. 







The Darwinian selection of thoughts seems to me a very material mechanism. All that is 
needed for selection is memory and a constraint in the form of a limited resource. The 
competition for the glucose in the blood flow is the most probable mechanism of 
selection of thoughts. I explored some primitive models of this kind in Molecules and 
Thoughts and in Pattern Chemistry of Thought and Speech, Section Thinking. 


PART THREE: THE h-HYPOTHESIS 


I can now formulate the “h-hypothesis” (“h” as in Figure 4D) of the origin and evolution 
of the mind, which also covers the origin and evolution of life and society: the pattern 
tricks are the same. 12 

1. The emergence of complexity in the mind is a random reversible (not in the 
thermodynamic sense) alternation (mutation) of combinations and derivations. 

2. It starts with the simplest minds, capable of spontaneous assembly, and continues 
under the pressure of Darwinian selection so that only those points and lines survive that 
increase the stability of the mind. 

3. Geological events keep the system far from equilibrium. 

4. A remarkable paradoxical property of the h-hypothesis is that while the mind expands, 
its content still converges to the most abstract ideas, which are therefore useless in 
“street life.” The mind works like a trash compactor. But this is why a scientist is, 
preferably, not a man from the street. 

5. The mathematical representation of the process is an incomplete, mixed, torn, and 
sparse subset of Bourbaki’s scale of sets. 


12 Compare with: Yuri Tarnopolsky and Ulf Grenander, History as Points and Lines. History is a 
succession of destructive and creative waves, but the products of old creative waves coexist with the most 
recent ones. The current coexistence of Amazonian tribes with sophisticated West European societies, 1% 
of super-rich with 15% poor in the USA, and the intellectual elite with the anti-Darwin and anti-science 
warriors in the same blessed land is an artifact of the scale-like social evolution. 


6. A possible way to test the h-hypothesis is to see if simple minds, like that of C. 
elegans, 13 can conform to it. 

But does C. elegans have a mind? 

I am a big admirer of many of Douglas Hofstadter’s ideas and of his poetic imagination. 

One among them has an immediate bearing on the above question. Douglas Hofstadter 
believes that any organism has a “soul,” but they are very different in size, increasing 
from bacteria to humans. 14 I agree, substituting “mind” for “soul,” although I would not 
mind “soul” either. Have I really said that? This view of the world is very abstract, but only 
the high abstraction goes to the deep bottom of things. 

Why can we understand the world? How can we generate new ideas? How can we 
successfully survive in the complex world? My answer is: it is a matter of size. Since 
the mind converges by generating more and more abstract lines from those coming from the 
sensors, the lines shrink in number with each new level of abstraction, and if the relevant 
ones are concentrated in the mind, there are so relatively few of them that reasoning occurs 
spontaneously and fast, without any algorithm. This is the conclusion one can draw 
from Ulf Grenander’s representation of the mind. Therefore, the “slow thinking” of the 
psychology of rationality is just a long sequence of fast stages: the size (quantity) matters 
and quantity translates into quality, as Georg Hegel told us long ago. Is America too big or 
too small to fail or to stand? 

The chemists, who routinely deal with extremely complex structures and transformations, 
simplify the problem of size-related complexity in a different way. They know that a 
single act of chemical transformation usually involves only a few atoms in close 
proximity. They look only at the most probable hot spots and ignore the rest of the 
structure. 

Whether it is a little worm or a theoretical physicist, as soon as the representation of the 
problem in the “mind” (or the mind) is reduced to a small enough size, random 
pattern-chemical recombinations can generate a very small number of alternative solutions 
to choose among. 15 

The above h-model looks like it addresses only half of the mind problem: representation 
of the world. There is the other part: action in response to the world. 


13 See Nivedita Chatterjee and Sitabhra Sinha, Understanding the mind of a worm: hierarchical network 
structure underlying nervous system function in C. elegans (2008). The authors note: “This [assortativity, a 
kind of hierarchy in networks] may shed light on one of the central questions in evolutionary biology that 
resonates strongly with the theme of this volume, namely, why did brains or central nervous systems 
evolve?” 

14 In I am a Strange Loop (2007) 

15 That would be a mechanism of fast thinking (Keith Stanovich, Daniel Kahneman, Amos Tversky), 
but I have a bone to pick with the division of thinking into fast and slow; see Pattern chemistry of 
rationality (Essay 58). 








Although I am not going to discuss it here, in Figure 6 I reproduce Figure 54.4 from my 
The New and the Different (1996, Chapter 54, The Mathematical Mind, p. 413), where more 
detail about the analyzer and synthesizer can be found. 

The core of the mind, obviously, is the realm of ideas, not of point sensations that come 
from the outside. The trash compactor leaves predominantly higher levels of the scale, which 
are less numerous and are not anchored at particular sensitive cells of the organism. Some 
of the lower-level ideas hide in the subconscious, and probably not all of them can be 
expressed in words. 

This is how the analyzer works. The next question is what to do with its output. This 
information must have some use, and the arrows should find their targets. 

The second part of the mind, the synthesizer, is a 
diverging scale, anti-symmetric to the analyzer. It uses the compacted processed 
information and converts it into elementary responses of the cells, mostly of muscles, 
including speech, writing, and work, as well as automatic reflexes poorly controlled by 
the mind. It is diverging because the number of both sensors and effectors is large 
compared with the number of the most abstract ideas, although this is only an uneducated guess. 

The synthesizer accepts the output of the converging “h-analyzer” and, in an IF-THEN 
manner, converts it into behavior in the same way as in any stage of derivation. What 
is different is that this derivation diverges into an array of elementary physical movements 
or their series. For a sufficiently complex organism, a large number of incoming 
configurations is analyzed into a smaller number of categories, and the result is used as a 
signal for a hierarchy of a large number of outgoing elementary movements. The 
derivation matrix is the standard block for both. 
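
The idea that one derivation block serves both the converging analyzer and the diverging synthesizer can be sketched in a few lines of Python. The wiring matrices, the thresholds, and the numbers of sensors, categories, and movements below are hypothetical and chosen only to keep the example small.

```python
def derive(signal, matrix, threshold):
    """One IF-THEN derivation step.

    Output j switches on (1) if the inputs wired to it, marked by 1 in
    row j of the 0/1 matrix, add up to at least the threshold.
    """
    return [
        int(sum(w * s for w, s in zip(row, signal)) >= threshold)
        for row in matrix
    ]


# Hypothetical wiring, for illustration only.
ANALYZER = [             # 6 sensors -> 2 categories (converging)
    [1, 1, 1, 0, 0, 0],  # category 0 listens to sensors 0-2
    [0, 0, 0, 1, 1, 1],  # category 1 listens to sensors 3-5
]
SYNTHESIZER = [          # 2 categories -> 6 elementary movements (diverging)
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
    [1, 1],              # this movement answers to either category
]


def respond(sensors):
    """Analyze many sensor states into few categories, then expand them
    into many elementary movements, using the same block twice."""
    categories = derive(sensors, ANALYZER, threshold=2)
    movements = derive(categories, SYNTHESIZER, threshold=1)
    return categories, movements


categories, movements = respond([1, 1, 0, 0, 0, 1])
# categories == [1, 0]; movements == [1, 1, 1, 0, 0, 1]
```

The only difference between the two stages is the shape of the matrix: wide for the analyzer, tall for the synthesizer.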

Having once invented a trick, nature uses it again and again as a pattern in very different 
areas. 



synthesizer. 






















SO, HOW DOES THE MIND REALLY WORK? 


We can better understand an exystem by complementing its history with the history of 
its understanding. In this regard, I would like to compare the h-hypothesis with the two 
perfectly indisputable leading ideas presented by Steven Pinker in his How the Mind 
Works. My understanding of both, however, could be disputed. 

First idea: the mind works because there is a correspondence between ideas, their 
symbols, and changes in the state of matter. The Turing machine and its incarnations 
process information (“ideas”) by manipulating symbols and physically writing and 
erasing them. The brain is the “matter” of the mind. I believe I follow this principle by 
wiring up the scale and avoiding the mathematical symbolism. 

Second idea: the mind, like life in general and other functions of our bodies, emerged in 
the process of the natural selection of replicators, which is the modern way to say 
“Darwinism.” This is my favorite idea, too, regarding mind, but from a different angle. 


My first remark is that any Turing machine must have a set of instructions or “reflexes,” 
as Steven Pinker notes. No picture of the mind, however, will be complete without an 
explanation of how this set comes into existence without the precondition of another 
human mind being present. Now, how does that other mind work? The Turing machine had 
Turing. Whom do we have? 






My second remark is that since natural selection of replicators is the cardinal pattern of 
evolution, it would be natural to generalize it further and apply it not just to life forms and 
memes 16 but to thoughts in the individual mind. This is the core of my approach. In my 
computer simulations I used a particular model of competition for a limited resource: 
Manfred Eigen’s concept of molecular evolution. 17 Ulf Grenander’s GOLEM, however, 
is more general because it uses casting a random number to determine the 
winning thought. In my view, casting a random number is a competition for 
a limited resource: the sum of all probabilities (always 1). It has one 
winner. But what happens to the thoughts next in line in order of decreasing 
probability? I see them living in the subconscious, the less probable the 
deeper. 
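
Casting a random number as a competition for the unit sum of probabilities can be shown with a short Python sketch. It is not GOLEM's code; the function names and the example thoughts are hypothetical, and the ranking function only illustrates the order in which the losers would sink.

```python
import random


def cast(probabilities, rng=random):
    """Pick the winning thought by casting a single random number.

    `probabilities` maps each thought to its probability; the values are
    assumed to sum to 1, so every thought owns a slice of the same limited
    resource, the interval from 0 to 1.
    """
    r = rng.random()
    cumulative = 0.0
    for thought, p in probabilities.items():
        cumulative += p
        if r < cumulative:
            return thought
    return thought  # guard against floating-point round-off in the sum


def depth_order(probabilities):
    """Thoughts ranked by decreasing probability: the first is the likeliest
    winner, the rest lie deeper the less probable they are."""
    return sorted(probabilities, key=probabilities.get, reverse=True)


# Hypothetical thoughts and probabilities.
thoughts = {"coffee": 0.2, "deadline": 0.7, "vacation": 0.1}
winner = cast(thoughts)
ranking = depth_order(thoughts)
```

A single cast has exactly one winner, but over many casts each thought wins in proportion to its slice, so the less probable thoughts are never excluded for good.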



I reproduce in Figure 7 a figure (Figure 8) from The Three Little Pigs: Chemistry of 
language acquisition. 



Figure 7. Consciousness as the winning configuration 
in competition for a limited resource 


ANSWER 

Mind is an exystem. It works by maintaining a probability distribution of competing 
thoughts. The winner (or a few leaders) rises to consciousness; the rest descend to the 
subconscious in order of decreasing probability. The probability distribution at a 
particular moment depends on the distribution at the preceding moment. Thus, the 
question “Whom do we have?” creates, slowly or quickly, a distribution which pushes 
this answer up into consciousness. (I confess, it was ready long before.) Until there is a 
computer that thrives on randomness and can be foolish, careless, forgetful, and deceitful, 
there is still a gap between thinking machines and the human mind. 
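
The dependence of each distribution on the preceding one can be sketched as follows. The multiplicative update and the `fitness` values are assumptions made for illustration; they stand in for whatever favors one thought over another and are not a reproduction of Eigen's equations or of the simulations mentioned above.

```python
def next_distribution(current, fitness):
    """Distribution at the next moment, computed from the current one.

    Each thought grows in proportion to its present share times a
    hypothetical fitness factor; renormalizing keeps the total at 1,
    the limited resource the thoughts compete for.
    """
    raw = {thought: current[thought] * fitness[thought] for thought in current}
    total = sum(raw.values())
    return {thought: value / total for thought, value in raw.items()}


def split(distribution, leaders=1):
    """Separate the leaders (consciousness) from the rest (subconscious),
    the rest listed in order of decreasing probability."""
    ranked = sorted(distribution, key=distribution.get, reverse=True)
    return ranked[:leaders], ranked[leaders:]


# Two thoughts start equal; the question favors one of them threefold.
state = {"answer": 0.5, "distraction": 0.5}
state = next_distribution(state, {"answer": 3.0, "distraction": 1.0})
conscious, subconscious = split(state)
# state == {"answer": 0.75, "distraction": 0.25}
```

Repeating the step pushes the favored thought further up, slowly or quickly depending on how unequal the factors are.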


16 For example, the medieval memes spread by some right wing Republican congressmen. 

17 M. Eigen, Selforganization of Matter and the Evolution of Biological Macromolecules, Die 
Naturwissenschaften, 58, 465-522 (1971). 



















CONCLUSION 


My not so hidden agenda with this essay was to show that the cardinal problem of the 
origin of complexity, mind and life included, may have a simple solution. Such a mundane 
property as the size of a structure plays as large a role in cognition as the size of a company, 
or just a mountain of money, plays in the economy. For thinking, however, the smaller, the better. 
This is all hypothetical, of course. 

I suspect that the h-hypothesis could be relevant for the origin, structure, and function of 
DNA (which is, ultimately, a kind of Mother Nature’s long and slow cooked thought), 
but this area is too big and too distant for me. Can anybody try? Here it is in a 
nutshell: the origin of DNA can be described using the h-hypothesis. This would explain the 
role of the DNA Dark Matter (former “junk”) in the chromosomes by its history. 

Pattern Theory, like mathematics in general, does not know borders between domains of 
the world and domains of our knowledge about it. As simplicity is partly measured by 
size, I am satisfied with these 18 pages (if with anything at all). 

Pattern Theory is the bridge between sciences and humanities. I believe my free-wheeling 
style naturally fits the borderless world. 


“I approve this message” is humming in my mind, induced by the din of the election campaign, a 
show based not on the Aristotelian TRUE/FALSE logic, but on the weights and thresholds 
expressed in decibels of insanity. 18 It is just another step of combinations and derivations, some 
of which are potentially fatal for nations. 



18 See Pattern chemistry of 2012 Elections, Essay 57 












APPENDIX 1 


MAIN SOURCES FOR PATTERN THEORY 

Ulf Grenander, General Pattern Theory: A Mathematical Study of Regular Structures, 
Oxford University Press, 1994. 

Ulf Grenander, Elements of Pattern Theory, The Johns Hopkins University Press, 
1996. 

Ulf Grenander, A Calculus of Ideas: A Mathematical Study of Human Thought, 
World Scientific Pub Co Inc., 2012. 

Numerous sites on the Web. 


PATTERN THEORY AND PATTERN CHEMISTRY: 

Yuri Tarnopolsky, COMPLEXITY, http://spirospero.net/complexity.html 

In particular: 

Molecules and Thoughts: Pattern Complexity and Evolution in Chemical Systems and the Mind 

The Three Little Pigs: Chemistry of language acquisition 

Pattern Chemistry of Thought and Speech and their Hypothetical Ancestor 

TIKKI TIKKI TEMBO: The Chemistry of Protolanguage 

Also: 

Essay 57. THE FEW AND THE MANY 

Essay 58. ALL RATIONAL MINDS ARE ALIKE; EACH IRRATIONAL MIND IS RATIONAL IN ITS OWN WAY. 
Pattern chemistry of rationality 














APPENDIX 2 


The idea of this essay dates back to 1956-1957, when I was a student of chemistry at 
Kharkov Polytechnic Institute. It was the second oldest technical university in the former Russian 
Empire. Its “institute” means the same as the “institute” in MIT. 

It was the time of Nikita Khrushchev’s “thaw,” the end of Stalinism, and the end of the 
prohibition on the “bourgeois pseudo-science” of cybernetics. The first ever public lectures 
on cybernetics, organized by Yuri Sokolovsky, a professor of the local academy of 
military communications, attracted a lot of attention. He also ran a seminar at my 
Institute for the staff of the department of Electrical Technology and I was the only 
(future) chemist there and, probably, the only student. The problem for discussion was 
“the reading automaton,” a device for recognizing text, now known as OCR, optical 
character recognition. I offered an idea and gave a presentation which did not cause any 
stir. 

My “reading automaton” would translate combinations of non-trivial elements of the 
letter, like the ends and the elbow of letter L or the sharp angles and the ends of letter N, 
into the name of the letter using the principle of the scale of sets. I still do not know if it 
would work, but I saw that the principle was much more universal than character 
recognition. 19 

Norbert Wiener’s Cybernetics, published in Russian in 1958, impressed me as much as 
the chemical experiments, which I saw at the age of 13, that made me a chemist. I began 
to read more on mathematical logic and set theory, and discovered the scale of sets. 

After decades of watching the development of computer and cognitive sciences, staying in 
contact with Ulf Grenander for many years, and watching his work on Pattern Theory of 
the mind, I still believe that my idea makes sense because it is mind-sterile—a kind of an 
oxymoronic pun. This is why mindless is not a good term. 


19 Yuri Tarnopolsky, The New and The Different 






APPENDIX 3 




Figure 8. Connectivity in the brain 


Figure 8 is reproduced from: Marcus Kaiser, A tutorial in connectome analysis: Topological and 
spatial features of brain networks. Neuroimage, Volume 57, Issue 3, 1 August 2011, Pages 892-907, 
Fig. 4. Marcus Kaiser uses this source. 


Abbreviations label some of the anatomical subregions of the brain. The cortex has a 2D 
topography, and the distances between small regions differ widely. Part A shows the 
topology of connections between the areas regardless of the distance. Part B 
(dendrogram) reflects the metric distance between the subregions. For example, MT and 
IP are close, SP is somewhat farther from both. Areas ST down to BSTS are far from 
areas SF down to LOF. Both show further division. 


I see a very oblique (less oblique from the pattern view) confirmation of the principle 
of combinatory derivation under constraints of natural selection in the detailed (a lot of 
subtleties!) research of connectivity in C. elegans. Figure 9 is reproduced from: Varshney 
LR, Chen BL, Paniagua E, Hall DH, Chklovskii DB (2011) Structural Properties of the Caenorhabditis 
elegans Neuronal Network. PLoS Comput Biol 7(2): e1001066. doi:10.1371/journal.pcbi.1001066. 

Since I am not a professional, my understanding and interpretation of this work could be 
grossly wrong. 














































































The map is an extremely detailed 90% complete connectivity (topological adjacency) 
matrix of two systems of contacts between neurons: electrical (blue circles) and chemical 
(red points) with the size of the mark reflecting the number of contacts per neuron. 



Figure 9. Adjacency matrices for the gap junction network (blue circles) and 
the chemical synapse network (red points) with neurons grouped by category 
(sensory neurons, interneurons, motor neurons). See the original work, Fig. 1, for 
details, most of which I do not touch. I added double “slash” (//) and normal 
“backslash” (\) diagonals. 

The important property of the map is that the neurons in each group (sensory, 
interneurons, and motor) are enumerated in topographical order from head to tail. The 
map, therefore, reflects not only topology but also topography. The matrices for the 
sensory and motor groups are visibly sparser and are populated along the diagonal. It 
means that the neurons there make contacts mostly with close topographic neighbors. 

The interneurons, on the contrary, form a rich “small world” network. The 
communication between head and tail, naturally, is extremely limited. I want to draw 
attention to the “slash” (/) diagonal of the matrix. To me it shows the pattern similarity of 
the overall organization of the nervous system as a result of its evolution: the sensory and 
motor neurons interact predominantly with topographic neighbors. Analyzer and 
synthesizer are anti-symmetric. Their type of grid-like continuity, with topographic 
diversity, reproduces the organization of the continuous Euclidean external space, 
whether as the object of perception or as the subject of action. The central nervous system, a 
descendant of its precursor in the worm, is where the two bundles intersect into a tight 
topological knot of a “near-complete” graph. This is the place where dreams and 
fantasies are born. 
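
What it means for a matrix to be populated along the diagonal can be made concrete with a small Python sketch. It uses two toy networks, not the C. elegans data: a chain in which every neuron contacts only its nearest neighbors, and a complete graph in which every neuron contacts every other. The function names and network sizes are hypothetical.

```python
def near_diagonal_fraction(adjacency, width):
    """Fraction of contacts (nonzero off-diagonal entries) that lie within
    `width` positions of the main diagonal of a square 0/1 matrix."""
    near = total = 0
    n = len(adjacency)
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i][j]:
                total += 1
                if abs(i - j) <= width:
                    near += 1
    return near / total if total else 0.0


def neighbor_network(n, reach):
    """Toy 'sensory or motor' network: neurons numbered from head to tail,
    each one contacting only neurons at most `reach` positions away."""
    return [[int(0 < abs(i - j) <= reach) for j in range(n)] for i in range(n)]


def complete_network(n):
    """Toy 'interneuron' knot: every neuron contacts every other one."""
    return [[int(i != j) for j in range(n)] for i in range(n)]


local = near_diagonal_fraction(neighbor_network(20, 2), width=2)
knot = near_diagonal_fraction(complete_network(20), width=2)
# local is 1.0; knot is 74/380, a little under 0.2
```

When neurons are numbered in topographic order, a high fraction indicates contacts between topographic neighbors, and a low fraction indicates that position along the body matters little.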

The authors of the cited work noted in the Discussion: “Several statistical properties [in 
particular, synapse multiplicity distribution] of the C. elegans network are similar to those of the 
mammalian cortex.” 

Hypothetically, the “combinatory derivation,” or scale, for which the term B-evolution 
(B for Bourbaki) could also be appropriate, explains how the mind evolves gradually, 
in waves, and smoothly, the way the little worm wriggles on videos posted by its 
well-deserved worshippers. 


APPENDIX 4 


8. SCALES OF SETS. STRUCTURES 

1. Given, for example, three distinct sets E, F, G, we may form other sets from them by 
taking their sets of subsets, or by forming the product of one of them by itself, or again by 
forming the product of two of them taken in a certain order. In this way we obtain twelve 
new sets. If we add these to the three original sets E, F, G, we may repeat the same 
operations on these fifteen sets, omitting those which give us sets already obtained; and 
so on. In general, any one of the sets obtained by this procedure (according to an explicit 
scheme) is said to belong to the scale of sets on E, F, G as base. 


Thus being given a certain number of elements of sets in a scale, relations between 
generic elements of these sets, and mappings of subsets of certain of these sets into 
others, all comes down in the final analysis to being given a single element of one of the 
sets in the scale. 

FROM: Bourbaki, N. (1968). Elements of Mathematics: Theory of Sets. Reading, 
Mass.: Addison-Wesley. 
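
The count of twelve new sets in the quoted passage can be checked with a short Python sketch. The sets are handled purely symbolically; the tuple notation `("P", E)` for the set of subsets and `("x", E, F)` for an ordered product is an assumption made for this illustration and is not Bourbaki's notation.

```python
from itertools import permutations


def one_step(sets):
    """Apply the three operations of the quoted passage once.

    For every set: its set of subsets and its product with itself.
    For every ordered pair of distinct sets: their product in that order.
    Sets already obtained are omitted from the result.
    """
    candidates = []
    for s in sets:
        candidates.append(("P", s))      # set of subsets of s
        candidates.append(("x", s, s))   # product of s by itself
    for a, b in permutations(sets, 2):
        candidates.append(("x", a, b))   # product of two sets, order matters
    seen = set(sets)
    new_sets = []
    for item in candidates:
        if item not in seen:
            seen.add(item)
            new_sets.append(item)
    return new_sets


base = ["E", "F", "G"]
first = one_step(base)            # 3 + 3 + 6 = 12 new sets
second = one_step(base + first)   # the same operations on the fifteen sets
```

Three sets of subsets, three self-products, and six ordered products give the twelve sets of the quotation; repeating the step on the fifteen sets shows how quickly the scale grows.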


October 2012 


UNFINISHED DRAFT 

Last updated: October 30, 2012 

March 1, 2013 

March 11, 2013 










